KOI: Lab Exercise 3

Creating a parser with Yacc. (Or actually with Bison, which is the version of Yacc that we will be using. Yacc and Bison are very similar, but there are some differences.)


Yacc and Bison

The original version of Yacc created a C file called y.tab.c, with compilable C code, and a header file called y.tab.h, with definitions of token codes. With Bison, you need to use the command-line argument "-d" to get a ".h" file. Also, Bison doesn't use the fixed names y.tab.c and y.tab.h, but instead uses the same base name as that of the input file, with .tab.c and .tab.h appended. For example, the command
bison -d dumscript.y
will create the files dumscript.tab.c and dumscript.tab.h.
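
To show what the token codes in the header file are for, here is a minimal sketch in C of the kind of definitions such a header contains, and of how a scanner can return them. The names NUM, ID and classify, and the numbers 258 and 259, are assumptions made for this illustration only. The real names come from the %token declarations in your grammar, and the real numbers are chosen by Bison.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>

/* Hypothetical stand-in for the token definitions in a generated header.
   The real names come from the %token declarations in the grammar, and
   the real numbers are chosen by Bison. They are kept above 255 so that
   they cannot collide with ordinary character codes. */
enum token_code { NUM = 258, ID = 259 };

/* Classify a single character the way a very small scanner might:
   digits and letters get token codes, anything else is returned as itself. */
int classify(int c)
{
    assert(c >= 0 && c <= UCHAR_MAX);  /* required by the <ctype.h> functions */
    if (isdigit(c))
        return NUM;
    if (isalpha(c))
        return ID;
    return c;  /* single-character tokens such as '+' use their character code */
}
```

Since both the scanner and the parser include the same header, they agree on what each number means.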

Yacc is very often used together with a scanner generator called Lex, and Bison is very often used together with a scanner generator called Flex. Flex is very similar to Lex.

Bison and Flex in various environments

Some Unix and Unix-like systems, such as various Linux distributions, come with these tools already installed. On Ubuntu 13.04, you need to install them, which can be done with the command sudo apt-get install bison flex. You can then run them using the commands bison and flex in a command shell.

On Windows, you have to download and install them separately. If you are managing your own Windows machine, you can download and then install a current version of Bison for Windows. At the same time, you probably want to download and install Flex for Windows.

Here are some local copies from 2013-09-12:

(Warning: The path to the directory where you install Flex and Bison should not contain any spaces!)

Some hints on how to get Flex and Bison to work with Visual Studio 2012:


You must re-run Bison and Flex whenever you have made changes to their input files. This will generate new versions of the C files, which Visual Studio will detect:

This file has been modified outside of the source editor. Do you want to reload it?

Answer Yes to All.

To build a project that contains Bison or Flex specifications, you can run the commands by hand and then compile and link the C files, but then you must remember to re-run them whenever you have changed the specification files. It is better to automate this process as much as you can. A simple script ("batch file") may be enough, but more advanced methods are available. On Unix-like systems such as Linux you can use make to automate the build process. In Visual Studio 2005 and 2008 there was a relatively simple way to do something similar (see Custom building and code generators in Visual Studio 2005), but in later versions of Visual Studio it is much more complicated, so there a simple script is probably the easier choice.

Addendum, October 8, 2013:
Bison and Flex are installed in the computer labs, with the path to them in the PATH variable, so they should now work when you run them.

Addendum, October 8, 2013:
An alternative way to get Bison and Flex to run automatically in Visual Studio 2012 is to go to the PROJECT menu, choose Properties... for your project, then Configuration Properties and Build Events, and enter the commands as a Pre-Build Event.

Another option: winflexbison

Part A: Bison and Flex

Try out a simple example, which implements a calculator that understands simple expressions. Download the parser specification example.y and the scanner specification example.lex, or both in example.zip.

Unpack, study the files, make a project, compile, and run it. (You can either look in the file Makefile for commands to give, or you can set up rules in Visual Studio.)

Part B: The calculator

Replace your hand-coded parser in the 2.9 program from lab exercise 2 with a Yacc-generated parser. The program should still generate postfix output and calculate the result.
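
As a reminder of how the result can be calculated from postfix output, here is a minimal sketch in C of a value stack. It is not taken from the 2.9 program. The names push, pop and apply, and the fixed stack size, are assumptions made for this illustration, and your own program may be organized differently.

```c
#include <assert.h>

#define STACK_SIZE 100          /* arbitrary limit chosen for this sketch */

static int stack[STACK_SIZE];
static int top = 0;             /* number of values currently on the stack */

/* Push an operand, as when a number appears in the postfix output. */
void push(int value)
{
    assert(top < STACK_SIZE);
    stack[top++] = value;
}

/* Remove and return the topmost value. */
int pop(void)
{
    assert(top > 0);
    return stack[--top];
}

/* Apply a binary operator to the two topmost values and push the result. */
void apply(int op)
{
    int right = pop();          /* the right operand was pushed last */
    int left = pop();
    switch (op) {
    case '+': push(left + right); break;
    case '-': push(left - right); break;
    case '*': push(left * right); break;
    case '/': assert(right != 0); push(left / right); break;
    default:  assert(!"unknown operator");
    }
}
```

For the postfix string 2 3 4 * +, the calls push(2); push(3); push(4); apply('*'); apply('+'); leave the value 14 on the stack. In a Yacc grammar, calls like these could be made from the semantic actions, at the same points where the postfix output is printed.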

With a hand-coded parser, it was difficult to handle both assignments and expressions. Let your Yacc grammar handle both, and see how easy it is!

Some more things to do:

There is a sample Yacc input file that (sort of) works with the 2.9 program in the lecture notes for lecture 5.

Part C: More operators

Implement the following operators from C and C++, in the grammar, in the postfix translator, and in the calculator:

Multi-character operators, such as == and ++, would require changing the scanner, so we will leave those for later.

In C and in C++, the ?: operator only evaluates one of the expressions expr2 and expr3. If you want, you can instead let your operator evaluate both expr2 and expr3.
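
The difference between the two behaviours can be seen in plain C. In this sketch, the names calls, traced, lazy_cond and eager_cond are hypothetical and introduced only for illustration. The counter shows how many of the operand expressions were actually evaluated.

```c
int calls = 0;   /* counts how many operand expressions were evaluated */

/* Return the value unchanged, but record that it was evaluated. */
int traced(int value)
{
    calls++;
    return value;
}

/* C's built-in ?: evaluates the condition and then only one of the branches. */
int lazy_cond(int condition)
{
    return condition ? traced(10) : traced(20);
}

/* An ordinary function receives both operands already evaluated,
   which corresponds to the simpler alternative of evaluating both. */
int eager_cond(int condition, int if_true, int if_false)
{
    return condition ? if_true : if_false;
}
```

After lazy_cond(1), the counter has increased by one. After eager_cond(0, traced(10), traced(20)), it has increased by two, since both arguments are evaluated before the function is called. The result is the same in both cases as long as the operands have no side effects, such as assignments.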


Show your results and discuss them with the teacher, or send an e-mail with clear and full explanations of what you have done. (Send your e-mail in plain text format, not as HTML or Word documents. Do not use attachments.) Include the source code, with your changes clearly marked.

Even if you don't send a report by e-mail, we advise that you write down your answers, to facilitate communication and for your own later use.

Thomas Padron-McCarthy (Thomas.Padron-McCarthy@oru.se) October 30, 2013