University of Linköping
Department of Computer Science
Tore Risch/Thomas Padron-McCarthy


                           Exam in

                    TDDB38 Database Technology
                    --------------------------

                  Saturday October 18, 1997, 9-13


++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


Utilities:      None

Points:         Maximally 30 points.
                15 points required to pass.

Results:        Will be posted in IDA's exam results room
                (on the ground floor in the northern part of the E building).

Walk thru:      The time will be posted along with the result.
    
Teacher
on duty:        Tore Risch, phone 070-6545231
                Thomas Padron-McCarthy, phone 070-7493646


++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


                        Instructions
                        ------------

Read through the entire exam and make notes of eventual obscurenesses
before you start solving the tasks. In addition to the instructions on
the cover of the test the following holds:

Write readably and clearly. Solutions that cannot be read will of
course not get any points, and unclear sentences will be misunderstood.

Additional assumptions to those in each task must be stated. Stated
assumptions may of course not modify the given task.


++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


                        GOOD LUCK!

Scenario for questions 1-4:

The Swedish Apple Research Institute in Linköping does research about
apple growing. The institute has a large number of apple trees,
of different sorts, and collects data about these. Each individual
apple is examined and weighed. The data for a certain year is then
stored in a database.
("Apple sort" and "tree sort" is the same thing.)

An E/R diagram for the database looks like this:

1. (4p) Implement this database in the relational model,
        i. e. translate the E/R diagram to tables.
        State the relations, the attributes in each relation,
        and the primary key.
        The implementation should be a good implementation.

        (Also look at question number 3 below,
        so that you can ask these questions in your solution!)


2. (3p) There is about a hundred apple sorts, thens of thousands of trees,
        and each tree can have hundreds of apples.
        Do realistic assumptions about which queries that will be used,
        state these assumptions, and then decide suitable storage structures
        (including secondary indexes) for the tables.


3. (4p) Formulate the following queries in SQL:

        a) At position x=98, y=47 there is an apple tree.
           Which tree sort does this tree belong to?
        b) The gardener "Sven Svensson" is responsible for a number of trees.
           Which tree sorts do these trees belong to?
        c) How much does, on average, an apple of the sort "Åkerö" weigh?
           (Hint: There is an aggregate function called "avg".)
        d) Which gardener has measured the largest number of apples?
           (You can, if you like, simplify by defining a view.)


4. (1p) Formulate query 3b above (about which tree sorts that the gardener
        "Sven Svensson" is responsible for) in relational algebra.

5. (5p) The flight carrier Struggle And Sorrow needs a new system for
        booking seats in order to improve their passenger service. The 
        seats on each flight are stored in a relation
             SEATBOOKINGS(FNO, DATE, SEAT, SMOKING, CLASS, PASSENGER)
        where FNO is the flight number, DATE is the day of the flight,
        SEAT is a seat number, SMOKING is 'y' for a seat with smoking allowed
        and 'n' otherwise, CLASS is 1 or 2, and PASSENGER is a space if
        the seat is unbooked, and the name of the passenger if the seat is 
        booked. The key is FNO, DATE, SEAT.                

        Implement, by using so called ECA triggers, a 'waiting list' where
        customers can register their seat requirements in case a suitable
        seat is already booked, and where the seat is booked as soon as
        a suitable seat is becoming available. Thereby it is also marked
        in the waiting list that the seat has become booked.
        The passenger should then be able to see in the waiting
        list which seat and flight s/he has received.

        The waiting list is a relation that you design yourself.
        Describe briefly how one registers on the waiting list.


6. (5p) What is replication?
        What is fragmentation?
        When is replication favourable?
        When is fragmentation favourable?
        How is the schema information stored in a distributed DBMS?


7. (4p) We have two relations:
          EMPLOYEE(SSN, NAME, SALARY, DNO, ADDRESS, PHONE)
          DEPARTMENT(DNO, DNAME, MGR, SECR, LOCATION)
        and the query:
        SELECT S.NAME FROM EMPLOYEE E, DEPARTMENT D, EMPLOYEE S
          WHERE E.NAME = "Kalle Karlsson" AND D.DNO = E.DNO AND
                D.SECR = S.SSN
        Translate the query to tuple relational calculus.
        Translate the query to domain relational calculus.


8. (4p) We have a large relation with many rows:
          SALESMEN(SSN, NAME, SALES, BONUS)
        One very often whishes to create a list of the 10 salesmen
        having the highest SALES figures (in order to increase their BONUS).
      - How should one design the database physically in order to efficiently
        support such top-10 queries?
      - What method do you use to retrieve the 10 best selling salesmen
        without retrieving all the rows in the relation? (in Standard SQL -
        not MS Access).
      - Explain briefly why the proposed method works.