KOI: Lab Exercise 3
Creating a parser with Yacc.
(Or actually with Bison, which is the version of Yacc that we will be using.
Yacc and Bison are very similar, but there are some differences.)
Resources:
Yacc and Bison
The original version of Yacc created a C file called y.tab.c,
with compilable C code,
and a header file called y.tab.h, with definitions of token codes.
With Bison,
you need to use the command-line argument "-d" to get a ".h" file.
Also, Bison doesn't use the fixed names y.tab.c and y.tab.h,
but instead uses the same base name as that of the input file,
with .tab.c and .tab.h appended.
For example, the command
bison -d dumscript.y
will create the files dumscript.tab.c and dumscript.tab.h.
Yacc is very often used together with a scanner generator called Lex,
and Bison is very often used together with a scanner generator called Flex.
Flex is very similar to Lex.
Bison and Flex in various environments
Some Unix and Unix-like systems, such as various Linux distributions, come with these tools already installed.
On Ubuntu 13.04, you need to install them, which can be done with the command sudo apt-get install bison flex.
You can then run them using the commands bison and flex in a command shell.
On Windows, you have to download and install them separately.
If you are managing your own Windows machine,
you can download and then install a current version of
Bison for Windows.
At the same time, you probably want to download and install
Flex for Windows.
Here are some local copies from 2013-09-12:
(Warning:
The path to the directory where you install Flex and Bison should not contain any spaces!)
Some hints on how to get Flex and Bison to work with Visual Studio 2012:
-
Create a folder for your work,
and put your Flex and Bison source files
(typically called something.lex and something.y)
there.
-
Open a command prompt, change directory to the folder mentioned above,
and verify that you can run Flex and Bison using these commands:
bison -d something.y
flex something.lex
Bison should generate the C files
something.tab.c and something.tab.h,
and Flex should generate the C file
lex.yy.c.
-
Create a normal C console project in Visual Studio.
(More help
here.)
-
Add your Flex and Bison source files to the project
so you can edit them using Visual Studio.
Drag and drop the Flex and Bison source files onto
Source Files in Solution Explorer.
This will not make copies of the files,
but instead add the actual files, in the folder mentioned above, to the project.
-
The C files that Flex and Bison generate must also be added to your project.
Right-click on the project name in Solution Explorer, and choose Add and then New Filter.
Call it Generated Files.
Drag and drop the generated files
(something.tab.c, something.tab.h and lex.yy.c)
onto Generated Files.
As above, this will not make copies of the files,
but instead add the actual files, still in the folder mentioned above, to the project.
-
Build the project and start the program.
Troubleshooting:
-
If Windows doesn't recognize the commands flex and bison,
verify that they really are installed, and add the path to that directory to the PATH variable.
-
The path to the directory where flex and bison are installed should not contain any spaces.
-
If Visual Studio 2012 gives an error message about the function fileno
in the C code for the Flex-generated scanner,
you can add the line
#pragma warning(disable : 4996)
in the C header section between the lines %{ and %} at the beginning of the Lex file
(something.lex).
-
In some configurations you may have to set the environment variable TEMP to a directory
where you have write permissions,
such as C:/Temp/ (note the trailing slash).
You must re-run Bison and Flex whenever you have made changes to their input files.
This will generate new versions of the C files.
Visual Studio will detect that the files have changed and ask whether to reload them:
Answer Yes to All.
To build a project that contains Bison or Flex specifications,
you can run the commands by hand,
and then compile and link the C files,
but you should automate this process as much as you can.
A simple script ("batch file") may be enough, but more advanced methods are available.
On Unix-like systems such as Linux you can use make to automate the build process.
In Visual Studio 2005 and 2008 there was a relatively simple way to do something similar
(see Custom building and code generators in Visual Studio 2005),
but in later versions of Visual Studio it is much more complicated.
There, a simple script is a reasonable alternative.
You can also run the commands by hand,
but then you must remember to re-run them every time you have changed the source files.
Addendum, October 8, 2013:
Bison and Flex are installed in the computer labs,
with the path to them in the PATH variable,
so they should now work when you run them.
Addendum, October 8, 2013:
An alternative way to have Bison and Flex run automatically
in Visual Studio 2012 is to go to the tab
PROJECT,
choose
projectname Properties...,
then
Configuration Properties
and
Build Events,
and then enter the commands as a
Pre-Build Event.
Another option:
winflexbison
Part A: Bison and Flex
Try out a simple example, which implements a calculator that understands
simple expressions.
Download the parser specification
example.y
and the scanner specification
example.lex,
or both in
example.zip.
Unpack, study the files, make a project, compile, and run it.
(You can either look in the file
Makefile
for commands to give,
or you can set up rules in Visual Studio.)
- Which types of expressions does the calculator understand?
- One common operator is missing. Which one?
- Add the missing operator!
Part B: The calculator
Replace your hand-coded parser
in the 2.9 program
from lab exercise 2
with a Yacc-generated parser.
The program should still generate postfix output and calculate the result.
With a hand-coded parser, it was difficult to handle both assignments and expressions.
Let your Yacc grammar handle both, and see how easy it is!
Some more things to do:
-
You can declare yyerror in the "definitions" part of
the Yacc input file: extern void yyerror(char*);
-
We must also define yyerror somewhere in the program,
and a good place to do that is in the "subroutines" part of the Yacc input file.
-
Yacc expects the scanner to be called yylex,
since this is the name of the function generated by Lex and Flex.
Define such a function (again, in the "subroutines" part of the Yacc input file),
and let it call the old scanner function, lexan.
-
The parser function generated by Yacc will be called yyparse,
but the program in which we will plug it in
expects the parser function to be called parse.
One way to handle this is to define a parse function,
which just calls yyparse.
-
Yacc generates its own token codes for ID, NUM etc.
Obviously, the scanner must use the same token codes, and not the
(different) ones that we used before.
Therefore you should change the header file global.h,
and replace the old definitions of token codes
with an #include of the Yacc-generated header file something.tab.h.
There is a sample Yacc input file that (sort of) works with the 2.9 program
in the lecture notes for
lecture 5.
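The glue functions described in the list above can be sketched in plain C.
This is only an illustration under stated assumptions, not the sample file from the lecture notes:
the token codes NUM and DONE, the fixed token sequence, and the counter tokens_seen are hypothetical,
and lexan and yyparse are replaced by small stand-ins so that the sketch compiles on its own.
In the real program, these functions would go in the "subroutines" part of the Yacc input file,
lexan would be the old scanner, and yyparse would come from the Bison-generated C file.
The sketch also assumes that the old scanner returns DONE at end of input.
A Bison-generated parser expects yylex to return 0 at end of input, so the wrapper translates DONE to 0.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical token codes; in the real program they come from
   the Yacc-generated header file something.tab.h. */
enum { NUM = 258, DONE = 259 };

/* Stand-in for the old scanner: replays a fixed token sequence. */
static const int demo_tokens[] = { NUM, '+', NUM, DONE };
#define DEMO_COUNT (sizeof demo_tokens / sizeof demo_tokens[0])
static size_t demo_pos = 0;

int lexan(void) {
    assert(demo_pos < DEMO_COUNT);
    int token = demo_tokens[demo_pos];
    if (token != DONE)
        demo_pos++;   /* stay on DONE once the input is exhausted */
    return token;
}

/* Called by the parser when it finds an error. */
void yyerror(char *message) {
    fprintf(stderr, "Error: %s\n", message);
}

/* The parser calls yylex; we forward to the old scanner.
   Assumption: DONE marks end of input, which yylex reports as 0. */
int yylex(void) {
    int token = lexan();
    return token == DONE ? 0 : token;
}

/* Stand-in for the generated parser: it only counts the tokens. */
int tokens_seen = 0;

int yyparse(void) {
    while (yylex() != 0)
        tokens_seen++;
    if (tokens_seen == 0) {
        yyerror("empty input");
        return 1;
    }
    return 0;
}

/* The rest of the program expects a function called parse. */
void parse(void) {
    yyparse();
}
```

With the stand-ins above, a call to parse() consumes the three tokens NUM, '+' and NUM, and then stops when yylex returns 0.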
Part C: More operators
Implement the following operators from C and C++, in the grammar,
in the postfix translator, and in the calculator:
- % (a synonym for mod)
- & (bitwise and)
- | (bitwise or)
- <
- >
- ?: (as in expr1 ? expr2 : expr3)
Multi-character operators, such as == and ++,
would require changing the scanner, so we leave those for later.
In C and in C++, the ?: operator only evaluates one of the expressions
expr2 and expr3.
If you want, you can instead let your operator evaluate both expr2 and expr3.
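As an illustration of the calculator side, here is a minimal sketch of how the new operators could be applied on a value stack.
It is not taken from the 2.9 program: the stack and the function names push, pop, apply_binary and apply_conditional are hypothetical,
and your calculator may be organized quite differently.
The sketch implements the simpler variant of ?:, where all three operands have already been evaluated when the operator is applied.

```c
#include <assert.h>

#define STACK_SIZE 100

/* A hypothetical value stack for the calculator. */
int stack[STACK_SIZE];
int top = 0;

void push(int value) {
    assert(top < STACK_SIZE);
    stack[top++] = value;
}

int pop(void) {
    assert(top > 0);
    return stack[--top];
}

/* Apply a binary operator to the two topmost values.
   The right operand was pushed last, so it is popped first. */
void apply_binary(int op) {
    int right = pop();
    int left = pop();
    switch (op) {
    case '%': push(left % right); break;  /* right must not be 0 */
    case '&': push(left & right); break;
    case '|': push(left | right); break;
    case '<': push(left < right); break;  /* 1 if true, 0 if false */
    case '>': push(left > right); break;
    default:  assert(0 && "unknown operator");
    }
}

/* Apply ?: in the variant that evaluates both branches:
   the condition and both branch values are already on the stack. */
void apply_conditional(void) {
    int else_value = pop();
    int then_value = pop();
    int condition = pop();
    push(condition ? then_value : else_value);
}
```

For example, pushing 7 and 3 and then applying '%' leaves 1 on the stack.
A version of ?: that only evaluates one of the branches, as in C, cannot work this way,
since both branch values are computed before the operator is applied.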
Report
Show your results and discuss them with the teacher,
or,
send an
e-mail
with clear and full explanations of what you have done.
(Send your e-mail in plain text format, not as HTML or Word documents.
Do not use attachments.)
Include the source code, with your changes clearly marked.
Even if you don't send a report by e-mail,
we advise that you write down your answers,
to facilitate communication and for your own later use.
Thomas Padron-McCarthy
(thomas.padron-mccarthy@oru.se)
October 30, 2013