Type systems and type checking

The course Compilers and interpreters | Lectures: 1 2 3 4 5 6 7 8 9 10 11 12

These lecture notes are my own notes, written for use during the lecture, and they are approximately what I will say there. They may be brief, incomplete and hard to understand, not to mention in the wrong language, and they do not replace the lecture or the book, but there is no reason to keep them secret if someone wants to look at them.

Today: Data types. Type systems. Type checking.

ALSU-07 sections 6.3 and 6.5
(ASU-86 6.1-6.2)

Static and dynamic checks

Static checking = performed by the compiler during compilation
Dynamic checking = performed during execution of the target program

Some static checks:

Type checks
Flow-of-control checks
Uniqueness checks
Name-related checks

Examples:

int main() {
  int i;

  *i = 2; // invalid type argument of `unary *'
  break;  // break statement not within loop or switch
 e:
 e:       // duplicate label `e'
  return 0;
}

6.3 Type systems

Each expression is of a certain type. Examples from C:

1 / 3 -- "int / int = int" -- 0
1 / 3.0 -- "int / double = (double)int / double = double" -- 0.3333333
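
As a minimal sketch of the two rules above, the compiler itself can be asked which type each expression has. The helper names int_div and mixed_div are made up for this illustration:

```cpp
#include <type_traits>

// int / int stays int, so the result is truncated.
static_assert(std::is_same<decltype(1 / 3), int>::value,
              "int / int has type int");
// int / double: the int operand is converted to double first.
static_assert(std::is_same<decltype(1 / 3.0), double>::value,
              "int / double has type double");

// Hypothetical helpers, only here so the values can be inspected at run time.
int int_div() { return 1 / 3; }
double mixed_div() { return 1 / 3.0; }
```

The static_assert lines are checked during compilation, so they are themselves a small example of static type checking.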

Basic types (int, char, etc.)
"Constructed" types (pointer to ... , array ... of ... , struct { ... })

Type expressions

A type expression is:

A basic type (int, char, etc.)
A type name (such as T after typedef int T;)
A type variable (can be used directly in some languages: type XT = typeof(x); XT y;)
A type constructor applied to one or more type expressions:
- Pointers: pointer(T) -- C ex: int*
- Arrays: array(I, T) -- Pascal ex: array[1..10] of integer; -- C ex: int a[10]
- Products: T₁ x T₂ (that is, a "tuple", or "argument list")
- Records: record(...) -- C ex: struct Customer { int nr; char name[10]; }
- Functions: int x int -> int -- C++ ex: int f(int, int);

Representation. ASU-86 fig 6.2.
char x char -> pointer(integer)
C: int* f(char, char);

Type tree:

        ->
      /   \
     x     pointer
    / \       \
char   char  integer

Type DAG:

        ->
      /   \
     x     pointer
    ||        \
   char      integer
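
One possible way to represent such type expressions in a compiler is as nodes with shared pointers, so that the DAG form simply reuses the same node twice. This is only a sketch; TypeNode, basic, unary, binary, show and example_dag are names invented here, not taken from the book:

```cpp
#include <memory>
#include <string>

// A node in a type expression.
struct TypeNode {
    std::string op;                    // "char", "integer", "x", "->", "pointer"
    std::shared_ptr<TypeNode> left;    // first operand, or null for basic types
    std::shared_ptr<TypeNode> right;   // second operand, or null
};
using TypeRef = std::shared_ptr<TypeNode>;

TypeRef basic(const std::string& name) {
    return std::make_shared<TypeNode>(TypeNode{name, nullptr, nullptr});
}
TypeRef unary(const std::string& op, TypeRef arg) {
    return std::make_shared<TypeNode>(TypeNode{op, arg, nullptr});
}
TypeRef binary(const std::string& op, TypeRef l, TypeRef r) {
    return std::make_shared<TypeNode>(TypeNode{op, l, r});
}

// Print a type expression in the notation used in these notes.
std::string show(const TypeRef& t) {
    if (!t->left) return t->op;                               // basic type
    if (!t->right) return t->op + "(" + show(t->left) + ")";  // one-argument constructor
    return show(t->left) + " " + t->op + " " + show(t->right);
}

// char x char -> pointer(integer), with one shared char node (the DAG form).
TypeRef example_dag() {
    TypeRef c = basic("char");
    return binary("->", binary("x", c, c), unary("pointer", basic("integer")));
}
```

In the tree form the two char leaves would be separate nodes; here both operands of x point to the same node, which is what the double edge in the DAG drawing means.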

If you haven't seen it before: cdecl

cdecl> declare a as pointer to int
int *a
cdecl> declare a as array 10 of pointer to function returning double
double (*a[10])()
cdecl> explain int *(*foo)[8]
declare foo as pointer to array 8 of pointer to int
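
The cdecl answers can be double-checked by building the same types step by step with aliases and letting the compiler compare them. A sketch in C++ (where an empty parameter list means "no parameters"); all alias names are invented for this example:

```cpp
#include <type_traits>

// "array 10 of pointer to function returning double", built inside-out.
using Fn  = double();    // function returning double
using PFn = Fn*;         // pointer to such a function
using Arr = PFn[10];     // array 10 of such pointers

// "pointer to array 8 of pointer to int", built the same way.
using PInt = int*;
using Arr8 = PInt[8];
using Foo  = Arr8*;

// Compare against the declarator syntax printed by cdecl (without the name).
constexpr bool arr_matches = std::is_same<Arr, double (*[10])()>::value;
constexpr bool foo_matches = std::is_same<Foo, int *(*)[8]>::value;
static_assert(arr_matches && foo_matches, "aliases agree with the cdecl output");
```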

Type systems

Type system = a set of rules for assigning type expressions to the various parts of a program.

Static and dynamic checking of types

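Everything can be checked dynamically, but some things can't be checked statically. Example: whether djur really points to a Hund is only known at run time, so dynamic_cast has to check it during execution: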

class Djur { public: int x; virtual void f() { } };
class Hund : public Djur { public: int y; };
class Katt : public Djur { public: int z; };

void f(Djur* djur) {
  Hund* hund = dynamic_cast<Hund*>(djur);
  if (hund != 0)
    hund->y = 4711;
}

int main() {
  f(new Djur());
  f(new Hund());
  f(new Katt());
}

Strong and weak typing

Sound type system = allows us to determine statically (that is, at compile time) that type errors can't occur

Strongly typed language = the compiler can guarantee that the program will execute without type errors.

Specification of a simple type checker

A type checker implemented as a translation scheme that synthesizes the type of each expression, from the types of its subexpressions.

A simple language

Four basic types: integer, char, type_error, void
Arrays and pointers
Addition

Grammar (slightly modified from the one on ASU-86 page 349):

Program --> Declarations ; Expr
Declarations --> Declarations ; Declarations | id : Type
Type --> char | integer | array [ num ] of Type | *Type
Expr --> literal | num | id | Expr + Expr | Expr [ Expr ] | *Expr

A literal is a character constant, for example 'a'.

Example program (with a type error?):

n : integer;
i : integer;
a : array [256] of char;
n + a[i] + 7

Idea: Add an attribute type to each node in the parse tree!

A translation scheme, with semantic actions, can be used for type checking. This part of the translation scheme saves the type of a variable in the symbol table (ASU-86 Fig 6.4):

Type --> char { Type.type := char; }
Type --> integer { Type.type := integer; }
Type --> *Type₁ { Type.type := pointer(Type₁.type); }
Type --> array [ num ] of Type₁ { Type.type := array(1..num.val, Type₁.type); }
Declarations --> id : Type { addtype(id.entry, Type.type); }
Program --> Declarations ; Expr
Declarations --> Declarations ; Declarations

Type checking of expressions

Explicit numbers are integers, etc:

Expr --> num { Expr.type := integer; }
Expr --> literal { Expr.type := char; }

Variables have the type they were declared as:

Expr --> id { Expr.type := lookup(id.entry); }

Adding two integers gives an integer, anything else is a type error:

Expr --> Expr₁ + Expr₂
  { if (Expr₁.type == integer and Expr₂.type == integer)
      Expr.type := integer;
    else
      Expr.type := type_error;
  }

Indexing into an array:

Expr --> Expr₁ [ Expr₂ ]
  { if (Expr₂.type == integer and Expr₁.type == array(s, t))
      Expr.type := t;
    else
      Expr.type := type_error;
  }
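
The rules for declarations, addition and indexing can be sketched as ordinary functions on type values. This is not the translation scheme itself (there is no parser here), just the semantic actions written out by hand. The representation (Type, TypePtr, make) and the names check_add, check_index and check_example are assumptions made for this sketch; addtype and lookup follow the notes but are keyed on the identifier's name:

```cpp
#include <map>
#include <memory>
#include <string>

struct Type;
using TypePtr = std::shared_ptr<Type>;

struct Type {
    enum Kind { Integer, Char, TypeError, Array } kind;
    TypePtr elem;   // element type; null for basic types
    int size;       // number of elements, arrays only
};

TypePtr make(Type::Kind k, TypePtr elem = nullptr, int size = 0) {
    return std::make_shared<Type>(Type{k, elem, size});
}

// Symbol table: declarations store a type, expressions look it up.
std::map<std::string, TypePtr> symtab;
void addtype(const std::string& id, TypePtr t) { symtab[id] = t; }
TypePtr lookup(const std::string& id) {
    auto it = symtab.find(id);
    return it != symtab.end() ? it->second : make(Type::TypeError);
}

// Expr --> Expr1 + Expr2
TypePtr check_add(const TypePtr& a, const TypePtr& b) {
    bool ok = a->kind == Type::Integer && b->kind == Type::Integer;
    return make(ok ? Type::Integer : Type::TypeError);
}

// Expr --> Expr1 [ Expr2 ]
TypePtr check_index(const TypePtr& arr, const TypePtr& idx) {
    if (arr->kind == Type::Array && idx->kind == Type::Integer) return arr->elem;
    return make(Type::TypeError);
}

// Type of  n + a[i] + 7  under the declarations of the example program.
TypePtr check_example() {
    addtype("n", make(Type::Integer));
    addtype("i", make(Type::Integer));
    addtype("a", make(Type::Array, make(Type::Char), 256));
    TypePtr sum = check_add(lookup("n"), check_index(lookup("a"), lookup("i")));
    return check_add(sum, make(Type::Integer));   // the number 7 is an integer
}
```

With these rules check_example() yields type_error: a[i] has type char, and the addition rule only accepts two integers. Note how type_error then propagates upward through the second addition.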

Type checking of statements

A statement has no value, so its type is void.

The while statement:

Stmt --> while ( Expr ) Stmt₁
  { if (Expr.type == integer and Stmt₁.type == void)
      Stmt.type := void;
    else
      Stmt.type := type_error;
  }

The expression statement from C:

Stmt --> Expr ;
  { if (Expr.type != type_error)
      Stmt.type := void;
    else
      Stmt.type := type_error;
  }
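
The two statement rules, written as plain functions. A minimal sketch: the enum Ty and the function names are made up here, and only basic types are modelled since nothing more is needed for these rules:

```cpp
// Stand-in for the type attribute of a node.
enum class Ty { Integer, Char, Void, TypeError };

// Stmt --> while ( Expr ) Stmt1
Ty check_while(Ty cond, Ty body) {
    return (cond == Ty::Integer && body == Ty::Void) ? Ty::Void : Ty::TypeError;
}

// Stmt --> Expr ;
Ty check_expr_stmt(Ty expr) {
    return expr != Ty::TypeError ? Ty::Void : Ty::TypeError;
}
```

For example, check_while(Ty::Integer, check_expr_stmt(Ty::TypeError)) gives Ty::TypeError: an error inside the loop body makes the whole while statement ill-typed.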

Type checking of functions

Grammar for function call:

Expr --> Expr₁ ( Expr₂ )

Add function declarations to the declaration part, for example with the type syntax integer -> integer:

f : integer -> integer;
f(7) + 3;

Grammar for the function type:

Type --> Type₁ "->" Type₂

Translation scheme for the function type:

Type --> Type₁ "->" Type₂
  { Type.type := function(Type₁.type, Type₂.type); }

Translation scheme for type checking a function call:

Expr --> Expr₁ ( Expr₂ )
  { if (Expr₁.type == function(s, t) and Expr₂.type == s)
      Expr.type := t;
    else
      Expr.type := type_error;
  }
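
The call rule as a function, again only as a sketch. Type, basic, function_type and check_call are names chosen for this example, and as a simplification the argument type is compared by kind only, which is enough as long as arguments are basic types:

```cpp
#include <memory>

struct Type;
using TypePtr = std::shared_ptr<Type>;

struct Type {
    enum Kind { Integer, Char, TypeError, Function } kind;
    TypePtr from;   // argument type s, functions only
    TypePtr to;     // result type t, functions only
};

TypePtr basic(Type::Kind k) {
    return std::make_shared<Type>(Type{k, nullptr, nullptr});
}
TypePtr function_type(TypePtr s, TypePtr t) {
    return std::make_shared<Type>(Type{Type::Function, s, t});
}

// Expr --> Expr1 ( Expr2 )
// The callee must have type function(s, t) and the argument must have type s.
TypePtr check_call(const TypePtr& f, const TypePtr& arg) {
    if (f->kind == Type::Function && f->from->kind == arg->kind) return f->to;
    return basic(Type::TypeError);
}
```

A full checker would compare s and the argument type with whatever notion of type equivalence the language uses, instead of comparing kinds.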

6.3.2 Equivalence of type expressions

Learn the difference between:

structural equivalence (two types are the same if they "look the same")
name equivalence (each named type is a distinct type)

Structural equivalence in C -- m1 and m2 have the same type:

double m1[14][17];
double m2[14][17];

More structural equivalence in C:

typedef int A[10];
typedef int B[10];
typedef int C[11];

int main() {
  A a;
  B b;
  C c;
  A* ap;
  B* bp;
  C* cp;

  ap = &a;
  ap = &b;
  ap = bp;

  ap = cp; /* warning: assignment from incompatible pointer type */
  ap = &c; /* warning: assignment from incompatible pointer type */

  return 0;
}
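
The same comparison can be made explicit by asking the compiler whether the types are identical. A sketch in C++ using std::is_same; the constant names are made up for this example:

```cpp
#include <type_traits>

typedef int A[10];
typedef int B[10];
typedef int C[11];

// A and B are just two names for "array 10 of int"; C has a different size.
constexpr bool a_equals_b = std::is_same<A, B>::value;
constexpr bool a_equals_c = std::is_same<A, C>::value;
// Pointers to equivalent types are equivalent too.
constexpr bool ap_equals_bp = std::is_same<A*, B*>::value;

static_assert(a_equals_b && ap_equals_bp, "A and B are the same type");
static_assert(!a_equals_c, "A and C differ in the array size");
```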

Checking type equivalence:

structural equivalence: recurse down to basic types (int, char, etc.)
name equivalence: just compare the names
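
The two checks can be sketched as follows. The Type representation and all function names are assumptions for this example; each type carries an optional name so that both kinds of equivalence can be tried on the same values:

```cpp
#include <memory>
#include <string>

struct Type;
using TypePtr = std::shared_ptr<Type>;

struct Type {
    enum Kind { Integer, Char, Pointer, Array } kind;
    TypePtr elem;       // pointee or element type; null for basic types
    int size;           // element count, arrays only
    std::string name;   // name given in a declaration; empty if anonymous
};

TypePtr integer_type() {
    return std::make_shared<Type>(Type{Type::Integer, nullptr, 0, ""});
}
TypePtr array_of(int n, TypePtr elem, const std::string& name) {
    return std::make_shared<Type>(Type{Type::Array, elem, n, name});
}

// Structural equivalence: same constructor, same size, and equivalent
// component types, recursing until a basic type is reached.
bool structurally_equal(const TypePtr& a, const TypePtr& b) {
    if (a->kind != b->kind) return false;
    if (a->kind == Type::Array && a->size != b->size) return false;
    if (!a->elem) return true;   // basic type: the kinds already matched
    return structurally_equal(a->elem, b->elem);
}

// Name equivalence: only the names are compared.
bool name_equal(const TypePtr& a, const TypePtr& b) {
    return !a->name.empty() && a->name == b->name;
}
```

With A = array_of(10, integer_type(), "A") and B = array_of(10, integer_type(), "B"), the two types are structurally equal but not name equal. The recursion in structurally_equal terminates only because these type expressions contain no cycles.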

C uses structural equivalence for all types except records ("structs"):

struct A { int foo; double fum; };
struct B { int blaj; double urko; };

struct A a;
struct B b;

a = b; /* error: incompatible types in assignment */

This is to avoid cycles in the type representation, which a struct that refers to itself would otherwise create:

struct L {
  int foo;
  struct L* next;
};

But note that TA1 and TA2 are the same type (since both are A):

typedef struct A TA1;
typedef struct A TA2;

TA1 ta1;
TA2 ta2;

ta1 = ta2; // ok
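
Both observations about structs can be confirmed at compile time. A sketch in C++ using std::is_same; the two constant names are invented for this example:

```cpp
#include <type_traits>

struct A { int foo; double fum; };
struct B { int blaj; double urko; };

typedef struct A TA1;
typedef struct A TA2;

// Same member types in the same order, but different struct names.
constexpr bool structs_equal = std::is_same<A, B>::value;
// Two typedef names for one and the same struct.
constexpr bool typedefs_equal = std::is_same<TA1, TA2>::value;

static_assert(!structs_equal, "struct A and struct B are distinct types");
static_assert(typedefs_equal, "TA1 and TA2 both name struct A");
```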


Thomas Padron-McCarthy (thomas.padron-mccarthy@oru.se), August 29, 2022