These lecture notes are my own notes that I made in order to use during the lecture, and it is approximately what I will be saying in the lecture. These notes may be brief, incomplete and hard to understand, not to mention in the wrong language, and they do not replace the lecture or the book, but there is no reason to keep them secret if someone wants to look at them.
Idag: Yacc. Yacc-grammatikfiler. Att bygga parse-träd.
%{ #include <stdio.h> #include "global.h" extern int tokenval; extern void yyerror(char*); %} %token DONE ID NUM DIV MOD %% start: list DONE ; list: expr ';' list | /* empty */ ; expr: expr '+' term { printf("+"); } | term ; term: term '*' factor { printf("*"); } | term MOD factor { printf("MOD"); } | factor ; factor: '(' expr ')' | ID { printf("%s", symtable[tokenval].lexptr); } | NUM { printf("%d", tokenval); } ; %% void yyerror(char *s) { fprintf(stderr, "%s\n", s); } int yylex(void) { return lexan(); } void parse() { yyparse(); }
Ett fullständigt program. Bygg med t ex bison och gcc. (Eller g++.)
%{ #include <stdlib.h> /* Required to compile with C++ */ #include <stdio.h> #include <ctype.h> extern int yyparse(); /* Required to compile with C++ */ extern void yyerror(char*); /* Required to compile with C++ */ extern int yylex(void); /* Required to compile with C++ */ %} %token DIGIT %% line: expr '\n' { printf("%d\n", $1); } ; expr: expr '+' term { $$ = $1 + $3; } | term ; term: term '*' factor { $$ = $1 * $3; } | factor ; factor: '(' expr ')' { $$ = $2; } | DIGIT ; %% int yylex(void) { int c; c = getchar(); if (isdigit(c)) { yylval = c - '0'; return DIGIT; } return c; } void yyerror(char *s) { fprintf(stderr, "%s\n", s); } int main() { yyparse(); return 0; }
"$-grejer" (dvs attribut!):
"{ $$ = $1; }" är default:factor : '(' expr ')' { $$ = $2; } expr : expr + expr { $$ = $1 + $3; }
factor: DIGIT { $$ = $1; }
Scannern i 2.9-programmet använder variabeln tokenval för att ange det lexikaliska värdet av en hittad token (dvs, vilket tal var det, vilken identifierare). Lex använder variabeln yylval, och det är också där som Yacc tittar för att hitta $-värdet på en token.
Som default är $-värdena heltal. Det går att ange andra datatyper än bara heltal, även flera olika i samma parser! Olika tokens och olika icke-terminaler kan ha olika typ på sitt $-resultat ($$). I deklarationsdelen av grammatikfilen:
%union { double dval; struct TreeNode* ptr; } %type <ptr> expr term factor %type <dval> num %token <dval> NUMBER
%error-verbose
Declaring precedence and associativity:%left '+' '-' %left '*' '/' %right UMINUS ... expr : expr '+' expr { $$ = $1 + $3; } | expr '-' expr { $$ = $1 - $3; } | expr '*' expr { $$ = $1 * $3; } | expr '/' expr { $$ = $1 * $3; } | '(' expr ')' { $$ = $2; } | '-' expr %prec UMINUS { $$ = -$2; }
Some of the following examples are adapted from Thomas Niemann: A Compact Guide to Lex & Yacc.
Right recursion:list: item | list ',' item ;
list: item | item ',' list ;
Yacc does the right thing, but gives a warning about a shift/reduce conflict. Use precedence to avoid the warning:stmt: IF expr stmt | IF expr stmt ELSE stmt | ...
%nonassoc IFX %nonassoc ELSE stmt: IF expr stmt %prec IFX | IF expr stmt ELSE stmt | ...
If nothing else matches, the token error will match everything until the first of (in this case) semicolon or right curly bracket.void yyerror(char *s) { fprintf(stderr, "line %d: %s\n", yylineno, s); }
stmt: ';' | expr ';' | PRINT expr ';' | VARIABLE '=' expr '; | WHILE '(' expr ')' stmt | IF '(' expr ')' stmt %prec IFX | IF '(' expr ')' stmt ELSE stmt | '{' stmt_list '}' | error ';' | error '}' ;
expr: expr '+' expr { $$ = $1 + $3; };
decl: type varlist; type: INT | FLOAT; varlist: VAR { setType($1, $0); } | varlist ',' VAR { setType($3, $0); } ;
Se även Matz Kindahls exempel.
Inte så bra som man tror!list: item1 { do_item1($1); } item2 { do_item2($3); } item3
Later:
C++:#define MAX_ARGS 4 enum TreeNodeType { IF, WHILE, PLUS, MINUS, TIMES }; struct TreeNode { enum TreeNodeType type; struct TreeNode* args[MAX_ARGS]; };
class ParseTreeNode { // ... }; class Stmt : public ParseTreeNode { // ... }; class If : public Stmt { ParseTreeNode* condition; ParseTreeNode* then_part; ParseTreeNode* else_part; };
The course Compilers and interpreters | Lectures: 1 2 3 4 5 6 7 8 9 10 11 12