KOI: Lab Exercise 4
This exercise is about lexical analysis (scanning),
with a hand-crafted deterministic finite state machine,
and with a Lex-generated scanner.
(But you will most likely be using Flex and not the original Lex.)
Resources
Learn about DFAs!
A finite state automaton, also called finite state machine,
(in Swedish: ändlig tillståndsmaskin, ändlig tillståndsautomat),
can be in one of several states.
For example, here are three states:
It is called finite (Swedish: "ändlig")
since it only has a finite (Swedish: ändligt)
number of states.
We add some transitions, or edges, that lets the machine
change state from one state to another:
The arrow pointing into state 1 is marked "Start",
which means that we always start in state 1.
From state 1 we can only move to state 2.
When we are in state 2, we can either go back to state 1,
or go on to state 3.
State 3 is the end state, marked with a double ring.
In a deterministic finite state automaton, or DFA
(Swedish: deterministisk tillståndsmaskin)
transitions are determined by some sort of input:
We move from state 1 to state 2 if we get an a as input.
From state 2, we move back to state 1 if we get a b as input,
and we move on to state 3 if we get an a.
To get from the start state to the end state,
we must get an input string that starts with an a,
and then has zero or more ba pairs.
As soon as we get two a in a row, we move to the end state.
These strings will work:
b
- aa
- abaa
- ababaa
- abababaa
- ...
If other input symbols than a and b can occur in the input,
we should handle them too. We add a fourth state,
which is also an end state, and were we
will arrive as soon as we find anything else than a and b
in the input:
Do!
In part A you will implement a finite state machine,
and in part B you will use Flex to build a scanner.
Part A
Implement the DFA described above as a program.
The program should read characters from standard input,
and report which state it is in.
There are several common ways to implement a state machine like this one.
- With goto and labels
- With a switch statement inside a while (1) loop
- With a table of states, input and transitions
Choose one of these ways, or try to invent a new one!
Part B
Continue building on the compiler from the previous assignment by
replacing the hand-coded scanner from the
2.9 program
(see the file
lexer.c)
with a Lex-generated one.
(But you will most likely be using Flex and not the original Lex.)
It should recognize all the token types expected by the parser.
Report!
Show your results and discuss them with the teacher,
or,
send an
e-mail
with clear and full explanations of what you have done.
Include well-chosen and commented test runs with input and output.
If you use attachments, send them as PDF and not Word.
Also send the source code, as a Zip file or equivalent.
Thomas Padron-McCarthy
(thomas.padron-mccarthy@oru.se)
November 28, 2022