COSC4307
Relations on Grammar
Symbols
1.
Leftmost-Deriving: L+nv
2.
Rightmost-Deriving:
R+nv
3.
Adjacent:
Avv
4.
Next-to:
Ntt
1.
Leftmost-Deriving.
Each nonterminal deriving some grammar symbols as
the
leftmost symbol (or the first) in one or more
steps.
L+nv
= L*nn * Lnv
Example-Grammar
E -> E + T
E -> T
T -> T * F
T -> F
F -> ( E )
F -> id
Lnv
E |
T |
F |
+ |
* |
( |
) |
id | |
E |
1 |
1 |
0 |
0 |
0 |
0 |
0 |
0 |
T |
0 |
1 |
1 |
0 |
0 |
0 |
0 |
0 |
F |
0 |
0 |
0 |
0 |
0 |
1 |
0 |
1 |
ß-- Lnn
---àß------- Lnt
-------à
L* nn =
L0 nn
+ L1nn +
… + L n-1
nn
1 0 0 1
1 0 1 1
1
1 1 1
= 0
1 0 + 0 1
1 + 0 1
1 = 0 1
1
0 0
1
0 0 0 0 0
0
0 0 1
L+
nv
= L* nn
* L
nv
1 1 1
1 1 0
0 0 0
0 0
= 0 1
1 * 0 1
1 0 0
0 0 0
0 0 1
0 0 0
0 0 1
0 1
1 1 1
0 0 1
0 1
= 0 1
1 0 0
1 0 1
0 0 0
0 0 1
0 1
From above L+ nv , we see that, among
others,
The
nonterminal E can leftmost-derive five symbols,
E,T,F, id, and ( while it
does not leftmost-derive three other symbols
in any number of
steps.
2.
Rightmost-Deriving.
Each nonterminal deriving some grammar symbols
as
the rightmost (or the last) symbol in one or more
steps.
R+ nv
= R* nn * R nv
Just like L* nn
,
R* nn
= R0
nn + R1 nn + R2 nn +
… + Rn-1 nn
For our example grammar,
R nv looks like:
E T F + *
(
)
id
---------------------------------------------------------
E
0
1
0
0
0
0
0
0
T
0
0
1
0
0
0
0
0
F 0 0 0 0 0 0 1
1
And,
R*nn
is:
1 0
0 0
1 0 0 0
1
1 1 1
= 0 1
0 + 0 0
1 + 0 0
0 = 0 1
1
0 0
1
0 0 0 0 0
0
0 0 1
And, R+
nv = R* nn *
R nv
1 1
1 0 1
0 0 0 0 0 0
= 0 1 1 * 0 0 1 0 0 0 0
0
0 0 1
0 0 0 0 0 0 1 1
0 1
1 0 0 0 1 1
= 0 0 1 0 0 0 1
1
0 0 0 0 0 0 1 1
From above, we
can see that, nonterminal E rightmost-derives four symbols, T, F, ), and id
while it does not rightmost-derive the other four symbols, E, +, *, and ( in any
number of steps.
3.
Adjacency.
This relation is about whether or not any two grammar symbols
do
appear adjacently on the righthand side of any production rule of the
grammar.
For example,
in the example grammar under consideration,
E -> E + T
E -> T
T -> T * F
T -> F
F -> ( E )
F -> id
we see that
each of the following two symbols are adjacent in that
given
direction.
E +
+ T
T *
*
F
(
E
E )
Note that this relation is not symmetric. That is, while “E +” are
adjacent, “+ E” may not be
and in fact, they are.
This relation can be represented by a matrix of size v-by-v (where v is
the number of
all grammar symbols).
We call this matrix “A”
A vv looks like in our example case
E |
T |
F |
+ |
* |
( |
) |
id | |
E |
0 |
0 |
0 |
1 |
0 |
0 |
1 |
0 |
T |
0 |
0 |
0 |
0 |
1 |
0 |
0 |
0 |
F |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
+ |
0 |
1 |
0 |
0 |
0 |
0 |
0 |
0 |
* |
0 |
0 |
1 |
0 |
0 |
0 |
0 |
0 |
( |
1 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
) |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
id |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
4. Next-to
In
this relation, we are interested in knowing whether or not any two terminal
symbols of the given grammar can possibly appear next to each other in any
sentence that can be derived by the grammar.
Clearly, the source program
has a syntax error if two terminals happen to appear next to each other when
they are not supposed to.
On
the other hand, even when every two consecutive terminals appearing in a source
program can so appear, the source program may not be free of syntax errors. So,
a matrix N of size t-by-t representing this relation shows a minimum syntax
requirement that any syntactically correct program must
satisfy.
This matrix can be
callculated as follows:
Ntt = Att + Atn
L+nt + (R+nt)T
Ant +
(R+nt)T Ann
L+nt
We
observe that two terminals t1 and t2 are a next-to pair if
1. they are adjacent (since
once two terminals are derived adjacently they remain adjacent and hence next-to
each other up to the end), which is what the first term above is for
or
2. t1 and a nonterminal P are
adjacent in that order and t2 can be leftmost-derived by this nonterminal P in
one or more steps, which is what the second term of the above formula is for
or
3. a certain nonterminal P and
t2 are adjacent in that order and t1 can be rightmost-derived by this
nonterminal P in one or more steps, which is what the third term of the above
formula is for, or
4. two nonterminals P and Q are
adjacent in that order and t1 can be rightmost-derived by the first nonterminal
P of the two and t2 can be leftmost-derived by the second nonterminal Q of the
two, which is what the last term of the above formula is for.
We will calculate this
matrix for our example grammar.
1. The first term is zero.
2. Second term: Atn
L+nt
Atn
0 |
1 |
0 |
0 |
0 |
1 |
1 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
L+nt
+ * ( )
id
0 |
0 |
1 |
0 |
1 |
0 |
0 |
1 |
0 |
1 |
0 |
0 |
1 |
0 |
1 |
Atn
L+nt
0 |
0 |
1 |
0 |
1 |
0 |
0 |
1 |
0 |
1 |
0 |
0 |
1 |
0 |
1 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
3.
Third term: (R+nt)T Ant
(R+nt)T
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
1 |
1 |
1 |
1 |
1 |
1 |
Ant
+ * ( )
id
1 |
0 |
0 |
1 |
0 |
0 |
1 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
(R+nt)T Ant
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
1 |
1 |
0 |
1 |
0 |
1 |
1 |
0 |
1 |
0 |
In our case, Ntt will be obtained from the
second term and the third term only as the other two terms are both zero and it
looks like the following:
+ * (
)
id
0 |
0 |
1 |
0 |
1 |
0 |
0 |
1 |
0 |
1 |
0 |
0 |
1 |
0 |
1 |
1 |
1 |
0 |
1 |
0 |
1 |
1 |
0 |
1 |
0 |