Örebro University
School of Science and Technology
Thomas Padron-McCarthy (thomas.padron-mccarthy@oru.se)






Exam

Compilers and Interpreters

for Dataingenjörsprogrammet, and others

Thursday October 31, 2024

Exam for:

DT135G Kompilatorer och interpretatorer, provkod A001




Aids: No aids.
Score requirements: Maximum score is 40.
To pass, at least 24 points are required.
Results: Announced no later than 15 working days after the exam.
Return of the exams: Electronically through the student portal "Studenttjänster".
Examiner and teacher on call: Thomas Padron-McCarthy, phone 070-73 47 013.




GOOD LUCK!!

Formulas

1. Eliminating left recursion

A left-recursive grammar can be transformed to a grammar that is not left recursive. Assume that the grammar contains a rule (or, more correctly, two productions) like this:
A -> A x | y
A is a non-terminal, but x and y are any constructions consisting of terminals and non-terminals.

The rule is replaced by the following two rules (or, more correctly, three productions), that describe the same language, but are not left recursive:

A -> y R
R -> x R | empty

2. Left factorization

Assume that the grammar contains this rule (two productions):
A -> x y | x z
A is a non-terminal, but x, y and z are any constructions consisting of terminals and non-terminals.

Replace with these three productions:

A -> x R
R -> y | z

Task 1 (6 p)

Here is a program segment written in a C-like language:
    a = b;
    if (c < d) {
        e = f * g / h + i + j;
    }
    else {
        if (k == l) {
            m = n;
        }
        else {
            o = p;
            q = r;
        }
        s = t / u / v;
        w = x;
    }

Translate the program segment to two of the following three types of representations.

a) an abstract syntax tree (by drawing it)

b) postfix code for a stack machine

c) three-address code

Note: Two of the three types, not all three. Should you answer with all three, the one with the highest points will be discarded.

Task 2 (3 p)

Here is a grammar. There are three terminals: a, b and c. There are three non-terminals: S, T and U. The start symbol is S.

S -> S T | ε
T -> a U | a b U
U -> c

There are two problems with this grammar that makes it unsuitable for use by an LL(1) parser. That is the type of predictive top-down parser with one-token lookahead that we have seen for example in the 2.9 program. Explain the problems!

Task 3 (10 p)

Explain what the following things are, where they are used, and what they are used for:

a) activation record

b) basic block

c) loop unrolling

d) reference counting

Scenario for task 4-7

We want to define areas on a map, using the corners of polygons. Here we have a map of Örebro university, with the campus area and the university library:

We want to use a special language to define the areas, like this:

map University {
    area Campus {
        corner 59.2562 N 15.2470 E
        corner 59.2562 N 15.2557 E
        corner 59.2519 N 15.2561 E
        corner 59.2524 N 15.2558 E
        corner 59.2525 N 15.2520 E
        corner 59.2532 N 15.2502 E
        corner 59.2536 N 15.2448 E
        corner 59.2537 N 15.2418 E
        corner 59.2548 N 15.2421 E
    }
    area Library {
        corner 59.2542 N 15.2504 E
        corner 59.2542 N 15.2515 E
        corner 59.2536 N 15.2515 E
        corner 59.2537 N 15.2505 E
    }
}

As we can see the input consists of a map, marked by the keyword map and the name of the map, followed by a list of one or more areas within curly brackets. The name is a single word that consists of English-alphabet letters.

An area is marked by the keyword area and a name, followed by a list of one or more corner points within curly brackets.

Finally a corner point is marked by the keyword corner and its map coordinates, first latitude and then longitude, each with a direction which can be N for north, E for east, W for west or S for south.

It should be possible to write the input in free format, so whitespace (blanks, tabs and newlines) should not be significant, except for separating the tokens.

Task 4 (4 p)

a) (2p) Which terminals are needed to write a grammar for the input language in the scenario?

b) (2p) Out of these terminals, some will not have fixed lexemes. Write regular expressions for each such terminal.

Task 5 (5 p)

Write a grammar for the input language. The start symbol should be input, which represents a complete input as described in the scenario above.

Task 6 (4 p)

Given your grammar, draw a parse tree (also called a concrete syntax tree) for the following input:

map Kebnekaise {
    area Peak {
        corner 67.9187 N 18.5728 E
    }
}

Task 7 (8 p)

Write a predictive recursive-descent parser for the input language, in a language that is at least similar to C, C++, C# or Java. You do not have to write exactly correct program code, but it should be clear which procedures exist, how they call each other, and what comparisons with token types are made. You can assume there is a function called scan, which returns the type of the next token, and a function called error, which you can call when something went wrong and which prints an error message and terminates the program.