Compilers and Interpreters: Solutions to the exam 2022-10-25

Please note that these are suggested solutions. There can be other solutions that are also correct. Some of these solutions may be more comprehensive than required for full points on the exam, or they might only refer to where the answers can be found. In addition, it has occurred in the history of the world that teachers have made errors in the suggested solutions. I myself have, of course, never made any such errors, but others have. Does anyone really read introductions like this one? I don't know. It may be so. Rhubarb rhubarb rhubarb. I have heard that the extras on a movie set are told to say that to simulate background noise in a restaurant or similar place. Below are the suggested solutions.

Task 1 (10 p)

A compiler's work is usually divided into a number of phases. Which are those phases? Explain briefly what each phase does. What is the input and the output of each phase?

Answer:

See also the book or the lecture notes, especially this figure, with the addition of the machine-dependent optimizer.

Task 2 (5 p)

We write a program to print a table of the first n powers of the number x. Unfortunately, we have made some errors in our code. In which phase of the compiler, or at which other times, are the following errors detected?

Program Error
#include <stdio.h>
#include <math.h>

int Main(void) {
    printf("Ange x: ");
    int x;
    scanf("%d", &x);
    printf("Ange n: );
    int *n;
    scanf("%d", n);
    for (int i = 1; i <= n; ++i) {
        int res = pow(x, i);
        printf("%d ** %d = %d\n", x, i, res);
    }
}



undefined reference to `main'



missing terminating " character

'n' is used uninitialized
comparison between pointer and integer
undefined reference to `pow'
expected ')' before ';' token


  1. linker
  2. scanner
  3. machine-independent code optimizer, or semantic analysis
  4. semantic analysis
  5. linker
  6. parser

Task 3 (6 p)

Here is a program segment written in a C-like language:
    if (x < y) {
        x = x + 1;
        z = x + y + z;
    }
    else {
        x = x / c;
        z = x + y + z * 3 / 4;
        y = 0;
    }

Translate the program segment to two of the following three types of representations. (Should you answer with all three, the one with the highest points will be discarded.)

a) an abstract syntax tree (by drawing it)

Answer:

An (abstract) syntax tree

b) postfix code for a stack machine

Answer:

        rvalue x
        rvalue y
        <
        gofalse ELSE-START
        lvalue x
        rvalue x
        push 1
        +
        =
        lvalue z
        rvalue x
        rvalue y
        +
        rvalue z
        +
        =
        jump AFTER-ELSE
label ELSE-START:
        lvalue x
        rvalue x
        push 2
        /
        =
        lvalue z
        rvalue x
        rvalue y
        +
        rvalue z
        push 3
        *
        push 4
        /
        +
        =
        lvalue y
        push 0
        =
label AFTER-ELSE:

c) three-address code

Answer:

        if x >= y goto ELSE-START
        x = x + 1
        t1 = x + y
        z = t1 + z
        goto AFTER-ELSE
label ELSE-START:
        x = x / 2
        t2 = x + y
        t3 = z * 3
        t4 = t3 / 4
        z = t2 + t4
        y = 0
label AFTER-ELSE:

Or, if we number the instructions and use those numbers instead of labels:

 (1)    if x >= y goto (6)
 (2)    x = x + 1
 (3)    t1 = x + y
 (4)    z = t1 + z
 (5)    goto (12)
 (6)    x = x / 2
 (7)    t2 = x + y
 (8)    t3 = z * 3
 (9)    t4 = t3 / 4
(10)    z = t2 + t4
(11)    y = 0
(12)    ....

Task 4 (5 p)

Here is some three-address code:

  (1) t1 = a + b
  (2) x = t1 / c
  (3) t2 = a + b
  (4) c = 4 * t2
  (5) t3 = x + y
  (6) t4 = a + b
  (7) if t3 >= t4 goto (11)
  (8) c = c + x
  (9) a = a - 1
 (10) goto (5)
 (11) ...

a) Some of the code might seem unnecessarily long, with all these temporary variables t1, t2 and so on. For example, couldn't the first two instructions have been replaced by the single instruction x = (a + b) / c? Explain!

Answer:

It's three-adress code, not four-address code! x, a, b and c are four variables.

b) Which basic blocks are there?

Answer:

Three blocks: instruction 1-4, instruction 5-7, and instruction 8-10.

c) Show an optimization that can be performed on the code!

Answer:

Instruction (1) and (3) have a common subexpression, a + b. (Not instruction (6), since a will have changed in later iterations of the loop.) The variable t2 and the instruction (3) can therefore be removed, and in instruction (4) we use t1 instead of t2.

The expression x + y is constant in the loop, so instruction (5) can be moved to before the loop.

Task 5 (5 p)

Here is a grammar. Names in italics are non-terminals. Names in bold are terminals. The start symbol is S.

S -> a T | a U
T -> b c
U -> b d

a) Why is this grammar not suitable for implementation in a top-down, predictive recursive-descent parser?

Answer:

There is a FIRST() conflict between the expansion a T and a U for the non-terminal S, so when the parser finds the token a in the input, it will not know which of the two productions to use. It wouldn't know which branch of the if statement to enter.

b) Show how the input a b c is parsed using a bottom-up parser with a stack. Show which shift and reduce operations are performed.

Answer:

  1. a is shifted. Now the stack contains a.
  2. b is shifted. Now the stack contains a b.
  3. c is shifted. Now the stack contains a b c.
  4. b c is reduced to T. Now the stack contains a T.
  5. a T is reduced to S. Now the stack contains S, and we are finished.

Task 6 (4 p)

a) (2p) Which terminals are needed to write a grammar for the input language in the scenario?

Answer:

meeting lecture lab done date time place

b) (2p) Out of these terminals, some will not have fixed lexemes. Write regular expressions for each such terminal.

Answer:

date: 2[0-9][0-9][0-9]-[0-1][0-9]-[0-2][0-9]
time: [0-2][0-9]:[0-5][0-9]
place: [A-Za-z0-9]+

Task 7 (4 p)

Write a grammar for the input language. The start symbol should be input, which represents a complete input as described in the scenario above.

Answer:

input -> commands done
commands -> command commands | ε
command -> meeting time_and_place | lecture time_and_place | lab time_and_place
time_and_place -> date time time place

Task 8 (8 p)

Write a predictive recursive-descent parser for the input language, in a language that is at least similar to C, C++, C# or Java. You do not have to write exactly correct program code, but it should be clear which procedures exist, how they call each other, and what comparisons with token types are made. You can assume there is a function called scan, which returns the type of the next token, and a function called error, which you can call when something went wrong and which prints an error message and terminates the program.

Answer:

For testing, you can download calendarparser.c, plus a Lex scanner specifikation calendar.l, and a Bison version of the grammar calendar.y, and a Makefile.

Since the grammar from the solution to the task above is already suitable for constructing a predictive recursive-descent parser, we can use it directly. Otherwise, we would have had to make some transformations.

int lookahead;

int scan(void) {
    return yylex();
}

void error() {
    printf("There has been an error. Sorry.\n");
    exit(EXIT_FAILURE);
}

void match(int expected) {
    if (lookahead != expected)
        error();
    else
        lookahead = scan();
}

void input(void);
void commands(void);
void command(void);
void time_and_place(void);

void input(void) {
    commands(); match(DONE);
}

void commands(void) {
    if (lookahead == MEETING || lookahead == LECTURE || lookahead == LAB) {
        command(); commands();
    }
    else {
        /* Empty */
    }
}

void command(void) {
    if (lookahead == MEETING) {
        match(MEETING); time_and_place();
    }
    else if (lookahead == LECTURE) {
        match(LECTURE); time_and_place();
    }
    else if (lookahead == LAB) {
        match(LAB); time_and_place();
    }
    else {
        error();
    }
}

void time_and_place(void) {
    match(DATE); match(TIME); match(TIME); match(PLACE);
}

void parse() {
    lookahead = scan();
    input();
    printf("Ok.\n");
}


Thomas Padron-McCarthy (thomas.padron-mccarthy@oru.se) November 8, 2022