COSC4307

 

Relations on Grammar Symbols

1. Leftmost-Deriving: L+nv  

2. Rightmost-Deriving:         R+nv   

3. Adjacent:                           Avv   

4. Next-to:                              Ntt   

 

1.      Leftmost-Deriving.

          Each nonterminal deriving some grammar symbols as the

          leftmost symbol (or the first) in one or more steps.

          L+nv    =  L*nn  *  Lnv

 

          Example-Grammar

 

          E -> E + T

          E -> T

          T -> T * F

          T -> F

          F -> ( E )

          F -> id

 


 

                   Lnv

E

T

F

+

*

(

)

id

E

1

1

0

0

0

0

0

0

T

0

1

1

0

0

0

0

0

F

0

0

0

0

0

1

0

1

 

           ß-- Lnn ---àß------- Lnt -------à

 

 

  L* nn  =  L0 nn  +   L1nn   +    + L n-1 nn 

 

               

                  1  0  0           1  1  0            1  1  1              1  1  1

            =     0  1  0    +   0  1  1    +    0  1  1     =        0  1  1

                        0  0  1         0  0  0            0  0  0              0  0  1

 

          L+ nv    =  L* nn  *  L  nv  

  

                             1  1  1         1  1  0  0  0  0  0  0

             =    0  1  1  *    0  1  1  0  0  0  0  0

                             0  0  1        0  0  0  0  0  1  0  1

 

                             1  1  1  0  0  1  0  1

                    =       0  1  1  0  0  1  0  1

                             0  0  0  0  0  1  0  1

 

          From above L+ nv , we see that, among others,

          The nonterminal E can leftmost-derive five symbols,

E,T,F, id, and ( while it does not leftmost-derive three other symbols

in any number of steps.

    

 

 


 

2.      Rightmost-Deriving.

 

            Each nonterminal deriving some grammar symbols as

          the rightmost (or the last) symbol in one or more steps.

         

          R+ nv  =  R* nn  *  R nv

 

          Just like  L*  nn ,

          R* nn  =   R0 nn  +  R1 nn  +  R2 nn  +    +  Rn-1 nn    

 

          For our example grammar,

         

          R nv looks like:            

E       T       F       +       *        (        )        id

---------------------------------------------------------

                   E       0       1       0       0       0       0       0       0

                   T       0       0       1       0       0       0       0       0

F       0       0       0       0       0       0       1       1

 

And, R*nn   is:              

                                      

1  0  0          0  1  0            0  0  1              1  1  1

            =     0  1  0    +   0  0  1    +    0  0  0     =        0  1  1

                        0  0  1         0  0  0            0  0  0              0  0  1

 

And, R+ nv  =  R* nn  *  R nv  

 

                         1 1 1       0 1 0 0 0 0 0 0

                   =    0 1 1 *    0 0 1 0 0 0 0 0

                         0 0 1       0 0 0 0 0 0 1 1

 

                         0 1 1 0 0 0 1 1                   

                   =    0 0 1 0 0 0 1 1

                         0 0 0 0 0 0 1 1

 

 

From above, we can see that, nonterminal E rightmost-derives four symbols, T, F, ), and id while it does not rightmost-derive the other four symbols, E, +, *, and ( in any number of steps.

 

3.      Adjacency.

 

          This relation is about whether or not any two grammar symbols do

          appear adjacently on the righthand side of any production rule of the

grammar.

 

For example, in the example grammar under consideration,

                    E -> E + T

                   E -> T

                   T -> T * F

                   T -> F

                   F -> ( E )

                   F -> id

 

we see that each of the following two symbols are adjacent in that

given direction.

 

          E  +

          +  T

          T  *

          *   F

          (   E

          E  )

    

          Note that this relation is not symmetric. That is, while  “E +” are

          adjacent,  “+ E” may not be and in fact, they are.

 

          This relation can be represented by a matrix of size v-by-v (where v is

the number of all grammar symbols).

          We call this matrix “A”

 

          A vv looks like in our example case

 

         

E

T

F

+

*

(

)

id

E

0

0

0

1

0

0

1

0

T

0

0

0

0

1

0

0

0

F

0

0

0

0

0

0

0

0

+

0

1

0

0

0

0

0

0

*

0

0

1

0

0

0

0

0

(

1

0

0

0

0

0

0

0

)

0

0

0

0

0

0

0

0

id

0

0

0

0

0

0

0

0

 

4.      Next-to

 

In this relation, we are interested in knowing whether or not any two terminal symbols of the given grammar can possibly appear next to each other in any sentence that can be derived by the grammar.

Clearly, the source program has a syntax error if two terminals happen to appear next to each other when they are not supposed to.

On the other hand, even when every two consecutive terminals appearing in a source program can so appear, the source program may not be free of syntax errors. So, a matrix N of size t-by-t representing this relation shows a minimum syntax requirement that any syntactically correct program must satisfy.

 

This matrix can be callculated as follows:

Ntt   =  Att   + Atn L+nt  +  (R+nt)T Ant +  (R+nt)T Ann L+nt           

 

We observe that two terminals t1 and t2 are a next-to pair if

1.    they are adjacent (since once two terminals are derived adjacently they remain adjacent and hence next-to each other up to the end), which is what the first term above is for or

2.    t1 and a nonterminal P are adjacent in that order and t2 can be leftmost-derived by this nonterminal P in one or more steps, which is what the second term of the above formula is for or

3.    a certain nonterminal P and t2 are adjacent in that order and t1 can be rightmost-derived by this nonterminal P in one or more steps, which is what the third term of the above formula is for, or

4.    two nonterminals P and Q are adjacent in that order and t1 can be rightmost-derived by the first nonterminal P of the two and t2 can be leftmost-derived by the second nonterminal Q of the two, which is what the last term of the above formula is for.

 

We will calculate this matrix for our example grammar.

 

1.  The first term is zero. 

2.  Second term: Atn L+nt                                              

Atn                                                                                             

0

1

0

0

0

1

1

0

0

0

0

0

0

0

0

    

     L+nt

        +      *      (       )      id

 0

 0

 1

 0

 1

 0

 0

 1

 0

 1

 0

 0

 1

 0

 1

       

    Atn L+nt

 

 0

 0

 1

 0

 1

 0

 0

 1

 0

 1

 0

 0

 1

 0

 1

 0

 0

 0

 0

 0

 0

 0

 0

 0

 0

 

 3.  Third term: (R+nt)T Ant                                                    

   

         (R+nt)T

0

0

0

0

0

0

0

0

0

1

1

1

1

1

1

 

     Ant

        +      *      (       )      id

 1

 0

 0

 1

 0

 0

 1

 0

 0

 0

 0

 0

 0

 0

 0

 

     (R+nt)T Ant                                                       

0

 0

 0

 0

 0

 0

 0

 0

 0

 0

 0

 0

 0

 0

 0

 1

 1

 0

 1

 0

 1

 1

 0

 1

 0

 

In our case,  Ntt will be obtained from the second term and the third term only as the other two terms are both zero and it looks like the following:

 

        +      *      (        )      id

 0

 0

 1

 0

 1

 0

 0

 1

 0

 1

 0

 0

 1

 0

 1

 1

 1

 0

 1

 0

 1

 1

 0

 1

 0