Conversion of a CFG
into
Chomsky Normal Form
Chomsky Normal Form
Chomsky Normal Form (CNF) has only productions of the form:
A ? B C or
A ? a
with A, B, C IV and aIT.
This means all rules are consisting of either a single terminal on the PRODUCTIONS, or they are binary, consisting of two variables.
Here is the algorithm to convert a given CFG into CNF.
0. Make the initial grammar G free of ?-productions, unit-productions, and useless variables and productions.
You have rules now, which contain one terminal on the PRODUCTIONS or at least two symbols from V E T on the PRODUCTIONS.
1. We substitute terminals ai, which appear in complex PRODUCTIONS (with more than one symbol), with new variables C1, C2,...
Add these C1, C2,... to V to form V .
We have to add matching new productions: Ci ? ai to P to form P .
Productions in P have either a single terminal on the PRODUCTIONS, or at least 2 variables:
A ? a or A ? V1 ... Vn with Vi I V for i=1,...,n and n?2
2. We split up the non-terminal productions in P and make them "binary".
We introduce new variables D1, D2, ... which we add to V .
Every A ? V1 ... Vn in P is now substituted with a set of new productions:
A ? V1 D1
D1 ? V2 D2
D2 ? V3 D3
. . .
Dn-1 ? Vn-1 Vn
The grammar is now in Chomsky NF: