Örebro University
School of Science and Technology
Thomas Padron-McCarthy (thomas.padron-mccarthy@oru.se)






Exam

Compilers and Interpreters

Wednesday August 15, 2024

Exam for:

DT135G Kompilatorer och interpretatorer, provkod A001







Aids: No aids.
Score requirements: Maximum score is 41.
To pass, at least 24 points are required.
Results: Announced no later than 15 working days after the exam.
Return of the exams: Electronically through the student portal "Studenttjänster".
Examiner and teacher on call: Thomas Padron-McCarthy, phone 070-73 47 013.




GOOD LUCK!!

Formulas

1. Eliminating left recursion

A left-recursive grammar can be transformed to a grammar that is not left recursive. Assume that the grammar contains a rule (or, more correctly, two productions) like this:
A -> A x | y
A is a non-terminal, but x and y are any constructions consisting of terminals and non-terminals.

The rule is replaced by the following two rules (or, more correctly, three productions), that describe the same language, but are not left recursive:

A -> y R
R -> x R | empty

2. Left factorization

Assume that the grammar contains this rule (two productions):
A -> x y | x z
A is a non-terminal, but x, y and z are any constructions consisting of terminals and non-terminals.

Replace with these three productions:

A -> x R
R -> y | z

Task 1 (10 p)

A compiler's work is usually divided into a number of phases. Which are those phases? Explain briefly what each phase does. What is the input and the output of each phase?

Task 2 (6 p)

Here is a program segment written in a C-like language:

    a = b * c;
    if (d < e * f / g + h) {
        while (i < j) {
            k = l + m * n + o;
        }
        p = q;
        if (r < s) {
            t = u;
        }
        else {
            v = w;
        }
    }
    x = y + z;

Translate the program segment to two of the following three types of representations. (Should you answer with all three, the one with the highest points will be discarded.)

a) an abstract syntax tree (by drawing it)

b) postfix code for a stack machine

c) three-address code

Note: Two of the three types, not all three.

Task 3 (4 p)

Which information is stored in the symbol table? Which phases use the symbol table, and how?

Task 4 (4 p)

In C we de-allocate memory that we no longer need using free. In C++ we can also use delete. Many other languages have automatic garbage collection. Describe what that means, and explain briefly how it works, especially how it can determine which memory we no longer need.

Scenario for task 5-7

With today's multi-core processors, which can run multiple processes or threads of execution in parallel, it is becoming more and more important to be able to write programs that do several things at the same time. A thread (in Swedish tråd) is a sequence of instructions that can be performed in parallel with other threads. Multiple threads can run in parallel, and they can share the same memory, so they can work with shared variables.

We want to create a simple programming language that works with threads. There is already a Bison grammar for the language:

%{
    #include <stdio.h>
    extern int yylex(void);
    extern void yyerror(const char*);
%}

%define parse.error verbose
%start program
%token VAR ID NUMBER THREAD PRINT 

%%

program : variables threads ;

variables : variable variables
             | %empty
             ;

variable : VAR ID '=' NUMBER ';' ;

threads : thread threads
             | %empty
             ;

thread : THREAD NUMBER '{' operations '}' ;

operations : operation operations
         | %empty
         ;

operation : ID '=' ID ';'
        | ID '=' NUMBER ';'
        | ID '=' ID '+' ID ';'
        | ID '=' ID '+' NUMBER ';'
        | PRINT '(' ID ')' ';'
        ;
       
%%

There is also a Flex specification for the terminals of the language:

%{
    #include "threads.tab.h"
%}

%%

[\n\t ]         { /* Ignore whitespace */ }
"="             { return '='; }
";"             { return ';'; }
"{"             { return '{'; }
"}"             { return '}'; }
"("             { return '('; }
")"             { return ')'; }
"+"             { return '+'; }
"var"           { return VAR; }
"thread"        { return THREAD; }
"print"         { return PRINT; }
[0-9]+          { return NUMBER; }
[A-Za-z][A-Za-z0-9]*    { return ID; }
.               { fprintf(stderr, "[ignoring unknown character '%c']\n", *yytext); }

%%

Task 5 (5 p)

Show what our new thread-based programming language looks like by giving three examples of programs in that language. Try to create illuminating examples.

Also describe some constructs that the language will not allow, especially if there are any that might be surprising to a user.

Task 6 (4 p)

Select one of your programs from the task above, or write a new one, and show the parse tree (also called a concrete syntax tree) for it.

Task 7 (8 p)

Write a predictive recursive-descent parser for the thread-based programming language, in a language that is at least similar to C, C++, C# or Java. You do not have to write exactly correct program code, but it should be clear which procedures exist, how they call each other, and what comparisons with token types are made. You can assume there is a function called scan, which returns the type of the next token, and a function called error, which you can call when something went wrong and which prints an error message and terminates the program.