a = b; if (c < d) { e = f * g / h + i + j; } else { if (k == l) { m = n; } else { o = p; q = r; } s = t / u / v; w = x; }
Translate the program segment to two of the following three types of representations.
a) an abstract syntax tree (by drawing it)
Answer:
The "K" nodes are connection nodes for lists of statements. Sometimes I have used ";" to mark these nodes.
b) postfix code for a stack machine
Answer:
lvalue a rvalue b = rvalue c rvalue d < gofalse ELSE-START-1 lvalue e rvalue f rvalue g * rvalue h / rvalue i + rvalue j + = goto AFTER-IF-1 label ELSE-START-1: rvalue k rvalue l == gofalse ELSE-START-2 lvalue m rvalue n = goto AFTER-IF-2 label ELSE-START-2: lvalue o rvalue p = lvalue q rvalue r = label AFTER-IF-2: lvalue s rvalue t rvalue u / rvalue v / = lvalue w rvalue x = label AFTER-IF-1:
c) three-address code
Answer:
a = b if (c >= d) goto ELSE-START-1 temp1 = f * g temp2 = temp1 / h temp3 = temp2 + i e = temp3 + j goto AFTER-IF-1 label ELSE-START-1: if (k != l) goto ELSE-START-2 m = n goto AFTER-IF-2 label ELSE-START-2: o = p q = r label AFTER-IF-2: temp4 = t / u s = temp4 / v w = x label AFTER-IF-1:
S -> S T | ε
T -> a U | a b U
U -> c
There are two problems with this grammar that makes it unsuitable for use by an LL(1) parser. That is the type of predictive top-down parser with one-token lookahead that we have seen for example in the 2.9 program. Explain the problems!
Answer:
a) activation record
Answer:
When a function is called during the execution of a program, we need somewhere to store that function's local variables, the place in the program to jump back to when the function returns, and some other data. This is stored in the activation record, which is an area in memory, usually on the call stack.
b) basic block
Answer:
A basic block is a sequence of instructions, for example in three-address code, where all the instructions will always be executed from start to finish. Thus there can be no jumps into the basic block, except to the first instruction, and no jumps out of the basic block, except from the last instruction. The basic blocks are used by the machine-independent code optimizer so it can re-use results that were calculated previously in the same block.
c) loop unrolling
Answer:
Loop unrolling is an optimization technique, used in the optimizer. It can sometimes replace a loop by several instances of the loop body, after one another. It makes the executable code longer, but since it doesn't need jumps and tests it can be faster.
d) reference counting
Answer:
Reference counting is a method to identify memory that is no longer in use, by keeping track of how many pointers point to each allocated memory area. Whenever we add, remove or change a pointer, we need to update the reference counters for the memory areas that are, or were, pointed to by those pointers. When the reference count for a memory area reaches zero, we know that there are no pointers to that area, so the area is unreachable and can be reclaimed.
Answer:
LEFTBRACE, RIGHTBRACE, MAP, AREA, CORNER, NORTH, EAST, SOUTH, WEST, NAME and COORDINATE
b) (2p) Out of these terminals, some will not have fixed lexemes. Write regular expressions for each such terminal.
Answer:
NAME: [A-Z][a-z]*
COORDINATE: [0-9]+\.[0-9]+
Answer:
input -> MAP NAME LEFTBRACE areas RIGHTBRACE
areas -> areas area | area
area -> AREA NAME LEFTBRACE corners RIGHTBRACE
corners -> corners corner | corner
corner -> CORNER COORDINATE direction COORDINATE direction
direction -> NORTH | EAST | SOUTH | WEST
(In reality the latitude can only have N and S, and the longitude can only have E and W, but the description in the scenario is not clear about that.)
map Kebnekaise { area Peak { corner 67.9187 N 18.5728 E } } |
Answer:
Answer:
The grammar in task 5 above is left recursive. We need to eliminate the left recursion before we can write a top-down parser:
input -> map
map -> MAP NAME LEFTBRACE areas RIGHTBRACE
areas -> area moreareas
moreareas -> area moreareas | ε
area -> AREA NAME LEFTBRACE corners RIGHTBRACE
corners -> corner morecorners
morecorners -> corner morecorners | ε
corner -> CORNER COORDINATE direction COORDINATE direction
direction -> NORTH | EAST | SOUTH | WEST
For testing, you can download mapparser.c, plus a Lex scanner specification map.l, and a Bison version of the grammar map.y, and a Makefile.
int lookahead; void match(int expected) { if (lookahead != expected) error(); else lookahead = scan(); } void input(void) { match(MAP); match(NAME); match(LEFTBRACE); areas(); match(RIGHTBRACE); } void areas(void) { area(); moreareas(); } void moreareas(void) { if (lookahead == AREA) { area(); moreareas(); } else { // Empty: Any lookahead matches, since you can always fit nothing anywhere } } void area(void) { match(AREA); match(NAME); match(LEFTBRACE); corners(); match(RIGHTBRACE); } void corners(void) { corner(); morecorners(); } void morecorners(void) { if (lookahead == CORNER) { corner(); morecorners(); } else { // Empty: Any lookahead matches, since you can always fit nothing anywhere } } void corner(void) { match(CORNER); match(COORDINATE); direction(); match(COORDINATE); direction(); } void direction(void) { if (lookahead == NORTH) { match(NORTH); } else if (lookahead == EAST) { match(EAST); } else if (lookahead == SOUTH) { match(SOUTH); } else if (lookahead == WEST) { match(WEST); } else { error(); } } void parse() { lookahead = scan(); input(); printf("Ok.\n"); } int main() { printf("Write a complete input, ending with EOF:\n"); parse(); }