Kompilatorer och interpretatorer (Compilers and Interpreters): Lecture 5

Note: This is an outline of what I intend to say in the lecture. It is not a definition of the course content, and it does not replace the textbook.

Today: More about Yacc. Building parse trees.
Aho et al, section 4.9. KP p 77-84. Thomas Niemann: A Compact Guide to Lex & Yacc (only the Yacc parts).

4.9 Parser generators

Yacc.

Some repetition from last time:

Example Yacc input file:
%{
  #include "global.h"
  extern int tokenval;
  extern void yyerror(char*);
%}

%token DONE ID NUM DIV MOD

%%

start: list DONE
        ;

list: expr ';' list
        | /* empty */
        ;

expr: expr '+' term { printf("+"); }
       | term
       ;

term: term '*' factor { printf("*"); }
       | term MOD factor { printf("MOD"); }
       | factor
       ;

factor: '(' expr ')'
       | ID { printf("%s", symtable[tokenval].lexptr); }
       | NUM { printf("%d", tokenval); }
       ;

%%

void yyerror(char *s) {
    fprintf(stderr, "%s\n", s);
}

int yylex(void) {
  return lexan();
}

void parse() {
  yyparse();
}

A calculator that calculates

ASU Fig. 4.56. (Modified.)

A complete program. Build it with bison and cc (or with bison and g++).

%{
  #include <stdlib.h> /* Required to compile with C++ */
  #include <stdio.h>
  #include <ctype.h>
  extern int yyparse(); /* Required to compile with C++ */
  extern void yyerror(char*); /* Required to compile with C++ */
  extern int yylex(void); /* Required to compile with C++ */
%}

%token DIGIT

%%

line:	expr '\n' { printf("%d\n", $1); }
	;

expr:	expr '+' term { $$ = $1 + $3; }
	| term
	;

term:	term '*' factor { $$ = $1 * $3; }
	| factor
	;

factor:	'(' expr ')' { $$ = $2; }
	| DIGIT
	;

%%

int yylex(void) {
  int c;
  c = getchar();
  if (isdigit(c)) {
    yylval = c - '0';
    return DIGIT;
  }
  return c;
}

void yyerror(char *s) {
  fprintf(stderr, "%s\n", s);
}

int main() {
  yyparse();
  return 0;
}

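yylval: the lexical analyzer stores the attribute value of a token in yylval before it returns the token. In an action, $1, $2, ... are the values of the symbols on the right-hand side of the rule, and $$ is the value of the left-hand side:

factor : '(' expr ')' { $$ = $2; }
expr : expr '+' expr { $$ = $1 + $3; }
"{ $$ = $1; }" is the default.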

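One way to picture what $$, $1 and $3 mean: the parser keeps a stack of attribute values next to its state stack, and a reduction replaces the topmost values with one new value. The C sketch below imitates this for the rule expr: expr '+' term. The names ValueStack, vs_push, vs_rhs and reduce_plus are made up for this illustration, and the code that Yacc actually generates looks different.

```c
#include <assert.h>

#define STACK_MAX 64

/* Hypothetical value stack; a real Yacc parser keeps one internally. */
struct ValueStack {
    int vals[STACK_MAX];
    int top;                    /* number of values currently on the stack */
};

void vs_push(struct ValueStack *s, int v) {
    assert(s->top < STACK_MAX);
    s->vals[s->top++] = v;
}

/* Returns a pointer to the values of the n right-hand-side symbols.
   rhs[0] plays the role of $1, rhs[1] of $2, and so on. */
int *vs_rhs(struct ValueStack *s, int n) {
    assert(n >= 1 && s->top >= n);
    return &s->vals[s->top - n];
}

/* expr: expr '+' term { $$ = $1 + $3; } */
void reduce_plus(struct ValueStack *s) {
    int *rhs = vs_rhs(s, 3);
    int result = rhs[0] + rhs[2];   /* $$ = $1 + $3 */
    s->top -= 3;                    /* pop expr, '+' and term */
    vs_push(s, result);             /* push the value of the new expr */
}
```

The '+' token also occupies a slot on the value stack (here with a dummy value), which is why the right operand is $3 and not $2.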
Priority for an ambiguous grammar

From ASU Fig. 4.57:
%token NUMBER
%left '+' '-'
%left '*' '/'
%right UMINUS

...

expr :   expr '+' expr { $$ = $1 + $3; }
       | expr '-' expr { $$ = $1 - $3; }
       | expr '*' expr { $$ = $1 * $3; }
       | expr '/' expr { $$ = $1 / $3; }
       | '(' expr ')' { $$ = $2; }
       | '-' expr %prec UMINUS { $$ = -$2; }
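Declaring precedence and associativity: %left and %right give the associativity of the listed tokens, and tokens declared on a later line have higher precedence. Here '*' and '/' bind tighter than '+' and '-', and UMINUS binds tightest.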
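To see what these declarations mean for the result, here is a small hand-written evaluator in C that uses the same precedence levels. Note that this is a different technique (precedence climbing in a recursive-descent parser), not what Yacc generates, and all function names are made up for the sketch. It assumes input without spaces, non-negative integer literals, and no division by zero.

```c
#include <assert.h>
#include <ctype.h>

#define UMINUS_PREC 3                   /* %right UMINUS */

/* Precedence of a binary operator, in the order of the %left lines.
   0 means "not a binary operator". */
int binary_prec(int op) {
    switch (op) {
    case '+': case '-': return 1;       /* %left '+' '-' */
    case '*': case '/': return 2;       /* %left '*' '/' */
    default:            return 0;
    }
}

int parse_expr(const char **p, int min_prec);

/* A primary is a number, a parenthesized expression, or a unary minus. */
int parse_primary(const char **p) {
    int v = 0;
    if (**p == '-') {
        (*p)++;
        return -parse_expr(p, UMINUS_PREC);
    }
    if (**p == '(') {
        (*p)++;
        v = parse_expr(p, 1);
        assert(**p == ')');
        (*p)++;
        return v;
    }
    assert(isdigit((unsigned char)**p));
    while (isdigit((unsigned char)**p)) {
        v = v * 10 + (**p - '0');
        (*p)++;
    }
    return v;
}

/* Parses operators whose precedence is at least min_prec. */
int parse_expr(const char **p, int min_prec) {
    int lhs = parse_primary(p);
    for (;;) {
        int op = **p;
        int prec = binary_prec(op);
        int rhs;
        if (prec == 0 || prec < min_prec)
            return lhs;
        (*p)++;
        /* Left associativity: the right operand may only contain
           operators that bind strictly tighter than op. */
        rhs = parse_expr(p, prec + 1);
        switch (op) {
        case '+': lhs = lhs + rhs; break;
        case '-': lhs = lhs - rhs; break;
        case '*': lhs = lhs * rhs; break;
        case '/': lhs = lhs / rhs; break;   /* no check for rhs == 0 */
        }
    }
}

int eval(const char *s) {
    return parse_expr(&s, 1);
}
```

With these levels, eval("8-3-2") groups as (8-3)-2 because '-' is left-associative, and eval("-2*3") groups as (-2)*3 because UMINUS binds tighter than '*'.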

Some of the following examples are adapted from Thomas Niemann: A Compact Guide to Lex & Yacc.

Recursion

Left recursion (more efficient in Yacc, since each item can be reduced as soon as it has been read, so the parser stack stays small):
list: 
      item 
      | list ',' item 
      ; 
Right recursion (all items must be pushed on the parser stack before the first reduction can be made):
list: 
      item 
      | item ',' list 
      ;
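The difference in stack use can be made concrete with a small C model. It does not run a real parser. It only counts how many grammar symbols would be on the parser stack while a list of n items is parsed with each of the two grammars. The function names are made up for this sketch.

```c
#include <assert.h>

/* Left recursion:  list: item | list ',' item ;
   Returns the maximum number of symbols on the stack for n items. */
int max_depth_left(int n) {
    int depth = 0, max = 0;
    assert(n >= 1);
    for (int i = 0; i < n; i++) {
        if (i > 0)
            depth++;            /* shift ',' */
        depth++;                /* shift item */
        if (depth > max)
            max = depth;
        depth = 1;              /* reduce at once: only "list" remains */
    }
    return max;
}

/* Right recursion:  list: item | item ',' list ;
   Nothing can be reduced until the last item has been shifted. */
int max_depth_right(int n) {
    int depth = 0, max = 0;
    assert(n >= 1);
    for (int i = 0; i < n; i++) {
        if (i > 0)
            depth++;            /* shift ',' */
        depth++;                /* shift item */
        if (depth > max)
            max = depth;
    }
    /* Now the reductions start from the right: the last item becomes a
       list, and then each "item ',' list" is replaced by "list". */
    while (depth > 1)
        depth -= 2;
    return max;
}
```

For 100 items the left-recursive grammar never needs more than 3 stack entries for the list, while the right-recursive one needs 199, so a long right-recursive list can overflow a fixed-size parser stack.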

The if-else shift/reduce conflict

stmt:
      IF expr stmt
      | IF expr stmt ELSE stmt
      | ...
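By default Yacc resolves the conflict by shifting, so each ELSE is attached to the nearest unmatched IF. That is the right thing, but Yacc still gives a warning about a shift/reduce conflict. Use precedence to avoid the warning: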
%nonassoc IFX 
%nonassoc ELSE 

stmt:  
      IF expr stmt %prec IFX 
      | IF expr stmt ELSE stmt 
      | ...

The error token

void yyerror(char *s) { 
      fprintf(stderr, "line %d: %s\n", yylineno, s); 
} 
If nothing else matches, the special token error matches. The parser then discards input until the first of (in this case) a semicolon or a right curly bracket, and continues parsing from there.
stmt: 
      ';' 
      | expr ';'
      | PRINT expr ';'
      | VARIABLE '=' expr ';'
      | WHILE '(' expr ')' stmt
      | IF '(' expr ')' stmt %prec IFX
      | IF '(' expr ')' stmt ELSE stmt
      | '{' stmt_list '}'
      | error ';'
      | error '}'
      ;
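The skipping part of this recovery can be sketched in plain C. This is only a model of the idea, not the code Yacc generates: tokens are single characters here, and skip_to_sync is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>

/* Model of the token-discarding step of error recovery.
   tokens: the input, one character per token (an assumption of this sketch).
   pos:    index of the token where the syntax error was detected;
           it must not be larger than the length of the input.
   Returns the index just after the first ';' or '}' at or after pos,
   or the length of the input if there is no such token. */
size_t skip_to_sync(const char *tokens, size_t pos) {
    assert(tokens != NULL);
    while (tokens[pos] != '\0') {
        char t = tokens[pos++];
        if (t == ';' || t == '}')
            break;      /* "error ';'" or "error '}'" has now matched */
    }
    return pos;
}
```

Parsing resumes at the returned position, so one syntax error does not stop the parser from checking the rest of the input.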

Synthesized Attributes

expr: expr '+' expr       { $$ = $1 + $3; };

Inherited Attributes

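Yacc has no direct support for inherited attributes, but $0 refers to the value just below the rule's own symbols on the value stack. Here that is the value of type, since varlist only appears directly after type.
decl: type varlist;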
type: INT | FLOAT;
varlist:
      VAR                 { setType($1, $0); }
      | varlist ',' VAR   { setType($3, $0); }
      ;

Embedded Actions

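An embedded action counts as a symbol of its own when the right-hand side is numbered, so item2 is $3, not $2:
list: item1 { do_item1($1); } item2 { do_item2($3); } item3 ;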

Debugging Yacc

...

Later:

Building parse trees

Simple (abstract syntax) trees in C:
#define MAX_ARGS 4

enum TreeNodeType { IF, WHILE, PLUS, MINUS, TIMES };

struct TreeNode {
  enum TreeNodeType type;
  struct TreeNode* args[MAX_ARGS];
};
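A possible way to build and use such trees is sketched below. It repeats the declarations above so that it is self-contained, and it adds some things that are assumptions for this sketch only: a NUM_LEAF node type with a value field, and the helper functions mknode, mkleaf and eval_tree.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdlib.h>

#define MAX_ARGS 4

/* NUM_LEAF is an addition for this sketch. */
enum TreeNodeType { IF, WHILE, PLUS, MINUS, TIMES, NUM_LEAF };

struct TreeNode {
    enum TreeNodeType type;
    int value;                          /* assumption: only used by NUM_LEAF */
    struct TreeNode *args[MAX_ARGS];
};

/* Creates a node with nargs children, given as extra arguments. */
struct TreeNode *mknode(enum TreeNodeType type, int nargs, ...) {
    struct TreeNode *n;
    va_list ap;
    assert(nargs >= 0 && nargs <= MAX_ARGS);
    n = malloc(sizeof *n);
    if (n == NULL)
        abort();
    n->type = type;
    n->value = 0;
    for (int i = 0; i < MAX_ARGS; i++)
        n->args[i] = NULL;              /* unused children are NULL */
    va_start(ap, nargs);
    for (int i = 0; i < nargs; i++)
        n->args[i] = va_arg(ap, struct TreeNode *);
    va_end(ap);
    return n;
}

struct TreeNode *mkleaf(int value) {
    struct TreeNode *n = mknode(NUM_LEAF, 0);
    n->value = value;
    return n;
}

/* Evaluates an expression tree by a recursive walk. */
int eval_tree(const struct TreeNode *n) {
    switch (n->type) {
    case NUM_LEAF: return n->value;
    case PLUS:     return eval_tree(n->args[0]) + eval_tree(n->args[1]);
    case MINUS:    return eval_tree(n->args[0]) - eval_tree(n->args[1]);
    case TIMES:    return eval_tree(n->args[0]) * eval_tree(n->args[1]);
    default:       abort();     /* IF and WHILE are not handled here */
    }
}
```

In a Yacc file where the value type is struct TreeNode*, an action could then build the tree instead of calculating, for example { $$ = mknode(PLUS, 2, $1, $3); }, and the tree can be walked after parsing.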
C++:
class ParseTreeNode {
  // ...
};

class Stmt : public ParseTreeNode {
  // ...
};

class If : public Stmt {
  ParseTreeNode* condition;
  ParseTreeNode* then_part;
  ParseTreeNode* else_part;
};


Thomas Padron-McCarthy (Thomas.Padron-McCarthy@tech.oru.se) February 5, 2003