The Structure of the Compiler
A compiler takes a source program as input and produces an equivalent sequence of machine instructions as output. This process is so complex that it is not reasonable, from either a logical or an implementation point of view, to treat compilation as occurring in a single step. For this reason, it is customary to partition the compilation process into a series of subprocesses called phases, as shown in Fig (3). A phase is a logically cohesive operation that takes one representation of the source program as input and produces another representation as output.
Fig (3) Phases of a Compiler
The first phase, lexical analysis, is carried out by the lexical analyzer, or scanner. It separates the characters of the source program into groups that logically belong together; these groups are called tokens. The usual tokens are keywords (IF, ELSE, DO, …), identifiers (X, Y, num, …), operator symbols (>, >=, =, -, +, …), and punctuation symbols (parentheses, commas). The output of the lexical analyzer is a stream of tokens, which is passed to the next phase, syntax analysis, carried out by the syntax analyzer, or parser. The tokens in this stream can be represented by codes, which we may regard as integers. Thus, "IF" might be represented by 1, "+" by 3, and "identifier" by 4. For a token like "identifier", a second quantity is passed along with the integer code, telling which of the identifiers used by the program this instance of the token represents.
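The token stream described above can be illustrated with a small scanner. The following is only a minimal sketch in Python, not a definitive implementation. The codes for IF (1), "+" (3), and identifier (4) follow the text; all other codes, the names TOKEN_CODES, KEYWORDS, and tokenize, and the choice of a symbol-table index as the "second quantity" are assumptions made for illustration.

```python
import re

# Integer token codes. IF=1, "+"=3 and IDENTIFIER=4 follow the text;
# every other code is an arbitrary choice made for this sketch.
TOKEN_CODES = {
    "IF": 1, "ELSE": 2, "+": 3, "IDENTIFIER": 4, "DO": 5, "-": 6,
    "=": 7, ">": 8, ">=": 9, "(": 10, ")": 11, ",": 12,
}
KEYWORDS = {"IF", "ELSE", "DO"}

# A word (keyword or identifier), or an operator/punctuation symbol.
# ">=" is listed before the single characters so it is not split into ">" and "=".
TOKEN_PATTERN = re.compile(r"([A-Za-z_][A-Za-z0-9_]*)|(>=|[>=+\-(),])")


def tokenize(source):
    """Return (tokens, symbol_table) for the given source text.

    Each token is a pair (code, attribute). The attribute is None except for
    identifiers, where it is the identifier's index in symbol_table
    (an assumed encoding of the "second quantity").
    """
    symbol_table = []  # distinct identifiers, in order of first appearance
    tokens = []
    pos = 0
    while pos < len(source):
        if source[pos].isspace():
            pos += 1  # whitespace only separates tokens
            continue
        match = TOKEN_PATTERN.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected character {source[pos]!r} at position {pos}")
        word, symbol = match.groups()
        if word is None:
            tokens.append((TOKEN_CODES[symbol], None))
        elif word in KEYWORDS:
            tokens.append((TOKEN_CODES[word], None))
        else:
            if word not in symbol_table:
                symbol_table.append(word)
            tokens.append((TOKEN_CODES["IDENTIFIER"], symbol_table.index(word)))
        pos = match.end()
    return tokens, symbol_table


print(tokenize("IF X + Y"))
# prints ([(1, None), (4, 0), (3, None), (4, 1)], ['X', 'Y'])
```

Here X and Y both carry the code 4, and the second element of each pair tells the parser which identifier is meant. A real scanner would also handle numeric constants, comments, and the remaining operators, which this sketch leaves out.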