Languages, Automata
1 INTRODUCTION
This chapter discusses three topics, languages, automata, and grammars. These three topics are closely related to each other. Our languages will use the letters a, b, ... to code data rather than the digits 0 and 1 used by some other texts.
2 ALPHABET, WORDS, FREE SEMIGROUP
Consider a nonempty set A of symbols. A word or string w on the set A is a ?nite sequence of its elements. For example, suppose A = {a, b, c}. Then the following sequences are words on A:
u = ababb and v = accbaaa
When discussing words on A, we frequently call A the alphabet, and its elements are called letters. We will also abbreviate our notation and write a2 for aa, a3 for aaa, and so on. Thus, for the above words, u = abab2 and v = ac2 ba3 .
The empty sequence of letters, denoted by l (Greek letter lambda) or (Greek letter epsilon), or 1, is also considered to be a word on A, called the empty word. The set of all words on A is denoted by A? (read: “A star”).
The length of a word u, written |u| or l(u), is the number of elements in its sequence of letters. For the above words u and v, we have l(u) = 5 and l(v) = 7. Also, l(l) = 0, where l is the empty word.
Remark: Unless otherwise stated, the alphabet A will be ?nite, the symbols u, v, w will be reserved for words on A, and the elements of A will come from the letters a, b, c.
Concatenation
Consider two words u and v on the alphabet A. The concatenation of u and v, written uv, is the word obtained by writing down the letters of u followed by the letters of v. For example, for the above words u and v, we have
uv = ababbaccbaaa = abab2 ac2 ba3
As with letters, for any word u, we de?ne u2 = uu, u3 = uuu, and, in general, un+1 = uun .
.