Kompilatorer och interpretatorer: Lecture 7

Note: This is an outline of what I intend to say in the lecture. It is not a definition of the course content, and it does not replace the textbook.

Today: Some more about Lex.
Syntax-directed translation. Building syntax trees.
ASU 5.1-5.5.

Some more about Lex

String literals

Lex and Yacc together

Reserved Words

Rest from previous lecture?

(From Niemann.)

If your program has a large collection of reserved words, it is more efficient to let lex simply match a string, and determine in your own code whether it is a variable or reserved word. For example, instead of coding

"if"            return IF;
"then"          return THEN;
"else"          return ELSE;

{letter}({letter}|{digit})*  {
         yylval.id = symLookup(yytext);
         return IDENTIFIER;
     }

where symLookup returns an index into the symbol table, it is better to detect reserved words and identifiers simultaneously, as follows:

{letter}({letter}|{digit})*  {
         int i;

         /* resWord returns the token code of a reserved word, or 0 for any other name */
         if ((i = resWord(yytext)) != 0)
             return i;
         /* not reserved: an ordinary identifier */
         yylval.id = symLookup(yytext);
         return IDENTIFIER;
     }

This technique significantly reduces the number of states required, and results in smaller scanner tables.
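
The text from Niemann does not show resWord itself. A minimal sketch of what it could look like, assuming a small sorted table that is searched with the standard bsearch function. The table layout, the helper cmpResWord and the numeric token values are assumptions made for this example; in a real scanner the token codes come from the header file generated by Yacc.

```c
#include <stdlib.h>
#include <string.h>

/* Token codes: hypothetical values, normally taken from y.tab.h. */
enum { IF = 258, THEN, ELSE, IDENTIFIER };

struct ResWord {
    const char *name;   /* spelling of the reserved word */
    int token;          /* token code to return to the parser */
};

/* Must stay sorted by name, because bsearch() requires a sorted array. */
static const struct ResWord resWords[] = {
    { "else", ELSE },
    { "if",   IF   },
    { "then", THEN },
};

/* bsearch() passes the search key first and a table element second. */
static int cmpResWord(const void *key, const void *elem)
{
    return strcmp((const char *)key, ((const struct ResWord *)elem)->name);
}

/* Returns the token code of a reserved word, or 0 if s is not reserved. */
int resWord(const char *s)
{
    const struct ResWord *hit =
        bsearch(s, resWords, sizeof resWords / sizeof resWords[0],
                sizeof resWords[0], cmpResWord);
    return hit ? hit->token : 0;
}
```

With this table, resWord("if") gives IF, while resWord("iffy") gives 0, so the rule above falls through to symLookup. A hash table would work just as well; the point is only that the lookup is done in C code and not in the scanner tables.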

5. Syntax-directed translation

Remember: syntax-directed translations associate semantic rules with the productions of a grammar. Two types:

Semantic rules may generate code, save information in a symbol table, issue error messages, or perform any other activities.

5.1 Syntax-directed definitions

Attributes = "record fields in the nodes of the parse tree".
Numbers, strings, pointers...

Attributes:

Annotating or decorating a parse tree = calculating attribute values

Attribute x depends on attribute y.
Ex: E.val depends on E1.val and E2.val

(A dependency graph shows dependencies between attributes in nodes, and can be used to determine the evaluation order of nodes.)

Attribute grammar = a syntax-directed definition with no side effects in the semantic rules

S-attributed definition = a syntax-directed definition with only synthesized (=S) attributes

Attribute values of terminals (=tokens) are supplied by the lexical analyzer. A token has no inherited attributes.
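
As an illustration of annotating a tree with a synthesized attribute, here is a minimal sketch in C. It assumes a simplified tree where every inner node is a binary operator and every leaf is a number whose val was supplied by the lexical analyzer. The names ExprNode and annotate are made up for this example. Since val at a node depends only on val at its children, a post-order traversal is a valid evaluation order.

```c
#include <stddef.h>

/* Hypothetical node layout: an operator or a number, plus the attribute val. */
struct ExprNode {
    char op;                  /* '+', '-', '*', or 'n' for a number leaf */
    int val;                  /* synthesized attribute; for a leaf, set by the scanner */
    struct ExprNode *left;    /* NULL in a leaf */
    struct ExprNode *right;   /* NULL in a leaf */
};

/* Annotates the subtree rooted at n with val, and returns n->val. */
int annotate(struct ExprNode *n)
{
    if (n->op == 'n')
        return n->val;        /* leaf: the value is already there */

    /* Children first (post-order), since E.val depends on E1.val and E2.val. */
    int l = annotate(n->left);
    int r = annotate(n->right);

    switch (n->op) {
    case '+': n->val = l + r; break;
    case '-': n->val = l - r; break;
    case '*': n->val = l * r; break;
    default:  n->val = 0;     break;  /* unknown operator: arbitrary choice in this sketch */
    }
    return n->val;
}
```

For the tree of 3*5+4, the call first stores 15 in the '*' node and then 19 in the '+' node at the root.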

5.2 Construction of syntax trees

Remember: But: terminology!

We can generate code (intermediate or target) (or calculate a result in a calculator) directly while parsing...

...or we can build a syntax tree, and then generate code (or calculate a result) from the tree.

Building syntax trees for expressions

Ex: a-4+c

ASU Fig. 5.9: Syntax-directed definition for constructing a syntax tree for an expression:

Production      Semantic Rule
E -> E1 + T     E.nptr = mknode('+', E1.nptr, T.nptr)
E -> E1 - T     E.nptr = mknode('-', E1.nptr, T.nptr)
E -> T          E.nptr = T.nptr
T -> ( E )      T.nptr = E.nptr
T -> id         T.nptr = mkleaf(ID, id.entry)
T -> num        T.nptr = mkleaf(NUM, num.entry)

(Remember from lecture 5:)

#define MAX_ARGS 4

enum TreeNodeType { IF, WHILE, PLUS, MINUS, TIMES };

struct TreeNode {
  enum TreeNodeType type;
  struct TreeNode* args[MAX_ARGS];
};
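
A minimal sketch of how mknode and mkleaf from the semantic rules could be written in C, and how the tree for a-4+c is built bottom-up. This is not the TreeNode struct from lecture 5: the Node struct, the NodeKind enum, and the split of mkleaf into mkleafId and mkleafNum are assumptions made for this example. A plain string stands in for the symbol-table entry id.entry.

```c
#include <stdlib.h>

enum NodeKind { NODE_OP, NODE_ID, NODE_NUM };

/* Hypothetical node layout for expression trees. */
struct Node {
    enum NodeKind kind;
    int op;                     /* '+' or '-' when kind == NODE_OP */
    const char *name;           /* stands in for id.entry when kind == NODE_ID */
    int value;                  /* the number when kind == NODE_NUM */
    struct Node *left, *right;  /* operands when kind == NODE_OP, else NULL */
};

/* Allocates a node of the given kind with all other fields cleared. */
static struct Node *newNode(enum NodeKind kind)
{
    struct Node *n = malloc(sizeof *n);
    if (n == NULL)
        abort();                /* out of memory: give up in this sketch */
    n->kind = kind;
    n->op = 0;
    n->name = NULL;
    n->value = 0;
    n->left = n->right = NULL;
    return n;
}

/* Operator node with two children. */
struct Node *mknode(int op, struct Node *left, struct Node *right)
{
    struct Node *n = newNode(NODE_OP);
    n->op = op;
    n->left = left;
    n->right = right;
    return n;
}

/* Leaf for an identifier. */
struct Node *mkleafId(const char *name)
{
    struct Node *n = newNode(NODE_ID);
    n->name = name;
    return n;
}

/* Leaf for a number. */
struct Node *mkleafNum(int value)
{
    struct Node *n = newNode(NODE_NUM);
    n->value = value;
    return n;
}

/* Builds the tree for a-4+c in the order a bottom-up parser would reduce. */
struct Node *buildExample(void)
{
    struct Node *p1 = mkleafId("a");
    struct Node *p2 = mkleafNum(4);
    struct Node *p3 = mknode('-', p1, p2);
    struct Node *p4 = mkleafId("c");
    return mknode('+', p3, p4);
}
```

The root is the '+' node, with the '-' node as its left child, because the grammar makes + and - left-associative: a-4+c is read as (a-4)+c. The nodes are never freed here; a real compiler would either free the tree after use or keep it for the whole run.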

Building syntax trees for statements

Directed Acyclic Graphs (DAGs)

5.3 Bottom-up evaluation of S-attributed definitions

This section is about how Yacc handles its synthesized attributes, that is, the $$, $1, etc., internally.

Skip.

5.4 L-attributed definitions

Skip.

5.5 Top-down translations

Skip.

Later:


Thomas Padron-McCarthy (Thomas.Padron-McCarthy@tech.oru.se) February 10, 2003