انت هنا الان : شبكة جامعة بابل > موقع الكلية > نظام التعليم الالكتروني > مشاهدة المحاضرة
الكلية كلية تكنولوجيا المعلومات
القسم قسم البرامجيات
المرحلة 3
أستاذ المادة أسعد صباح هادي الجبوري
12/27/2011 7:39:08 PM
Implementing the Lexical Analyzer The LEX can build from its input a lexical analyzer that behaves roughly like a finite automaton. The idea is to construct an NFA Ni for each token pattern Pi in the translation rules, then link these NFA s together with a new start states as shown in fig (25). Next we convert this NFA to a DFA.
There are several nuance in this procedure of which the reader should be aware. First, there are in the combined NFA several different "accepting states". That is, the accepting state of each Ni indicates that its own token, Pi has been found. When we convert to a DFA, the subsets we construct may include several different final states. Moreover the final states lose some of their significance, since we are looking for the longest prefix of the input which matches some patterns. After reaching a final state, the lexical analyzer must continue to simulate the DFA until it reaches a state with no next state for the current input symbol. Upon reaching termination, it is necessary to review the states of the DFA which we have entered while processing the input. Each such state represents a subset of the NFA s states, and we look for the last DFA state which includes a final state for one of the pattern-recognizing NFA s Ni. That final state indicates which token we have found. If none of the states which the DFA has entered includes any final state of the NFA, then we have an error condition. If the last DFA state to include a final NFA state in fact includes more than one final state, then the final state for the pattern listed first has priority. Example: Suppose we have the following LEX program. AUXILIARY DEFINITION (none) TRANSLATION RULES a {} /* actions are omitted here*/ abb {} a*b+ {} The three tokens above are recognized by the simple automata of fig (26). We may convert the NFA s of fig (26) into one NFA as described earlier. The result is shown in fig (27). Then this NFA may be converted to a DFA using the algorithm and the transition table is shown in fig (28), where the states of the DFA have been named by lists of the states of the NFA. The last column in fig (28) indicates the token which will be recognized if that state is the last state entered that recognizes any token at all. In all cases but the last line, state 68, the token recognized is the only token whose final state is included among the NFA states forming the DFA state. For example, among NFA states 2, 4, and 7, only 2 is final, and it is the final state of the automaton for regular expression a in fig (26). Thus, DFA state 247 recognizes token a . In the case of DFA state 68, both 6 and 8 are final states of their respective nondeterministic automata. Since the translation rules of our LEX program mention abb before a*b+, NFA state 6 has priority, and we announce that abb has been found in DFA state 68.
?
Fig (27) NFA recognizing three different tokens
State a b Token found 0137 247 8 none 247 7 58 a 8 --- 8 a*b+ 7 7 8 none 58 --- 68 a*b+ 68 --- 8 abb Fig (28) Transition Table for DFA.
Suppose that the first input characters are aba. The DFA of fig. (28) starts off in state 0137. On input a it goes to state 247. Then on input b it progresses to state 58, and on input a it has no next state. We have thus reached termination, progressing through the DFA states 0137, then 247, then 58. The last of these includes the final NFA state 8 from fig (26). Thus the action for state 58 of the DFA is to show that the token a*b+ has been recognized and to select ab, the prefix of the input that led to state 58, as TOKEN. If the DFA state 58, that is the last state entered before termination not include a final state of some NFA, then we would consider the DFA state previously entered, that is state 247 and recognize the token a , which state 247 call for. The prefix a would in that case be TOKEN. In fig (29) we see NFA s for the patterns of fig (24) . We use here a simple NFA for identifiers and constants. ?
The DFA constructed from fig. (29) is shown in fig (30). Each state of the deterministic automata is a subset of the states of the nondeterministic automaton. State S0 is the set of states {0, 1, 7, 11, 14, 19, 24, 26, 28, 30, 33, 35, 38, 40}
? The states marked # have transition to state {25} on all letters and digit for which no transition is shown. On input such as THE followed by a non letter or digit, the DFA break in state {17, 25}, which contain accepting state 25 of the nondeterministic automaton. That state signals that an identifier has been found and the appropriate action is taken. On input <A , the lexical analyzer goes to state {29, 31, 36} on < , from which no transition on A is possible. On the three state of NFA, only state 29 is accepting and it indicates that the token < has been found, and take the specified action. If we want to minimize the DFA, we must use one of the methods for minimization, and after that we see that the result DFA is the same.
المادة المعروضة اعلاه هي مدخل الى المحاضرة المرفوعة بواسطة استاذ(ة) المادة . وقد تبدو لك غير متكاملة . حيث يضع استاذ المادة في بعض الاحيان فقط الجزء الاول من المحاضرة من اجل الاطلاع على ما ستقوم بتحميله لاحقا . في نظام التعليم الالكتروني نوفر هذه الخدمة لكي نبقيك على اطلاع حول محتوى الملف الذي ستقوم بتحميله .
|