Örebro University
School of Science and Technology
Thomas Padron-McCarthy (thomas.padron-mccarthy@oru.se)






Exam

Compilers and Interpreters

for Dataingenjörsprogrammet, and others

Tuesday October 25, 2022

Exam for:

DT135G Kompilatorer och interpretatorer, provkod A001




Aids: No aids.
Score requirements: Maximum score is 47.
To pass, at least 26 points are required.
Results: Announced no later than 15 working days after the exam.
Return of the exams: Electronically through the student portal "Studenttjänster".
Examiner and teacher on call: Thomas Padron-McCarthy, phone 070-73 47 013.




GOOD LUCK!!

Formulas

1. Eliminating left recursion

A left-recursive grammar can be transformed to a grammar that is not left recursive. Assume that the grammar contains a rule (or, more correctly, two productions) like this:
A -> A x | y
A is a non-terminal, but x and y are any constructions consisting of terminals and non-terminals.

The rule is replaced by the following two rules (or, more correctly, three productions), that describe the same language, but are not left recursive:

A -> y R
R -> x R | empty

2. Left factorization

Assume that the grammar contains this rule (two productions):
A -> x y | x z
A is a non-terminal, but x, y and z are any constructions consisting of terminals and non-terminals.

Replace with these three productions:

A -> x R
R -> y | z

Task 1 (10 p)

A compiler's work is usually divided into a number of phases. Which are those phases? Explain briefly what each phase does. What is the input and the output of each phase?

Task 2 (5 p)

We write a program to print a table of the first n powers of the number x. Unfortunately, we have made some errors in our code. In which phase of the compiler, or at which other times, are the following errors detected?

Program Error
#include <stdio.h>
#include <math.h>

int Main(void) {
    printf("What is x: ");
    int x;
    scanf("%d", &x);
    printf("What is n: );
    int *n;
    scanf("%d", n);
    for (int i = 1; i <= n; ++i) {
        int res = pow(x, i);
        printf("%d ** %d = %d\n", x, i, res;
    }
}



undefined reference to `main'



missing terminating " character

'n' is used uninitialized
comparison between pointer and integer
undefined reference to .pow'
expected ')' before ';' token


Task 3 (6 p)

Here is a program segment written in a C-like language:
    if (x < y) {
        x = x + 1;
        z = x + y + z;
    }
    else {
        x = x / 2;
        z = x + y + z * 3 / 4;
        y = 0;
    }

Translate the program segment to two of the following three types of representations. (Should you answer with all three, the one with the highest points will be discarded.)

a) an abstract syntax tree (by drawing it)

b) postfix code for a stack machine

c) three-address code

Task 4 (5 p)

Here is some three-address code:

  (1) t1 = a + b
  (2) x = t1 / c
  (3) t2 = a + b
  (4) c = 4 * t2
  (5) t3 = x + y
  (6) t4 = a + b
  (7) if t3 >= t4 goto (11)
  (8) c = c + x
  (9) a = a - 1
 (10) goto (5)
 (11) ...

a) Some of the code might seem unnecessarily long, with all these temporary variables t1, t2 and so on. For example, couldn't the first two instructions have been replaced by the single instruction x = (a + b) / c? Explain!

b) Which basic blocks are there?

c) Show an optimization that can be performed on the code!

Task 5 (5 p)

Here is a grammar. Names in italics are non-terminals. Names in bold are terminals. The start symbol is S.

S -> a T | a U
T -> b c
U -> b d

a) Why is this grammar not suitable for implementation in a top-down, predictive recursive-descent parser?

b) Show how the input a b c is parsed using a bottom-up parser with a stack. Show which shift and reduce operations are performed.

Scenario for task 6-8

There are a number of electronic calendars, such as Google Calendar or Apple iCloud Calendar. Now we want to create our own electronic calendar, and this calendar will use an input language to schedule activities.

Here is an example of how this input language will look:

meeting 2022-10-25 10:00 12:00 T101
meeting 2022-10-25 13:00 15:00 Zoom
lecture 2022-10-27 15:15 17:00 T131
lab 2022-10-28 08:15 12:00 T004
meeting 2022-10-26 10:00 12:00 Zoom
done

This input language should have four keywords: meeting, to schedule a meeting, lecture, to schedule a lecture, lab, to schedule a lab, and done, to mark the end of the input.

The commands meeting, lecture and lab require four arguments: a date, a starting time, an ending time, and a place. The place is a single word without spaces.

It should be possible to write the input in free format, so the following input should be equivalent to the one above:

meeting
   2022-10-25

10:00 12:00 T101 meeting 2022-10-25 13:00 15:00 Zoom
lecture 2022-10-27 15:15
   17:00
     T131
lab
  2022-10-28   08:15   12:00
   T004
meeting   2022-10-26   10:00   12:00   Zoom done

Task 6 (4 p)

a) (2p) Which terminals are needed to write a grammar for the input language in the scenario?

b) (2p) Out of these terminals, some will not have fixed lexemes. Write regular expressions for each such terminal.

Task 7 (4 p)

Write a grammar for the input language. The start symbol should be input, which represents a complete input as described in the scenario above.

Task 8 (8 p)

Write a predictive recursive-descent parser for the input language, in a language that is at least similar to C, C++, C# or Java. You do not have to write exactly correct program code, but it should be clear which procedures exist, how they call each other, and what comparisons with token types are made. You can assume there is a function called scan, which returns the type of the next token, and a function called error, which you can call when something went wrong and which prints an error message and terminates the program.