University of Linköping Department of Computer Science Tore Risch/Thomas Padron-McCarthy Exam in TDDB38 Database Technology -------------------------- Saturday October 18, 1997, 9-13 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ Utilities: None Points: Maximally 30 points. 15 points required to pass. Results: Will be posted in IDA's exam results room (on the ground floor in the northern part of the E building). Walk thru: The time will be posted along with the result. Teacher on duty: Tore Risch, phone 070-6545231 Thomas Padron-McCarthy, phone 070-7493646 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ Instructions ------------ Read through the entire exam and make notes of eventual obscurenesses before you start solving the tasks. In addition to the instructions on the cover of the test the following holds: Write readably and clearly. Solutions that cannot be read will of course not get any points, and unclear sentences will be misunderstood. Additional assumptions to those in each task must be stated. Stated assumptions may of course not modify the given task. ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ GOOD LUCK! Scenario for questions 1-4: The Swedish Apple Research Institute in Linköping does research about apple growing. The institute has a large number of apple trees, of different sorts, and collects data about these. Each individual apple is examined and weighed. The data for a certain year is then stored in a database. ("Apple sort" and "tree sort" is the same thing.) An E/R diagram for the database looks like this: 1. (4p) Implement this database in the relational model, i. e. translate the E/R diagram to tables. State the relations, the attributes in each relation, and the primary key. The implementation should be a good implementation. (Also look at question number 3 below, so that you can ask these questions in your solution!) 2. (3p) There is about a hundred apple sorts, thens of thousands of trees, and each tree can have hundreds of apples. Do realistic assumptions about which queries that will be used, state these assumptions, and then decide suitable storage structures (including secondary indexes) for the tables. 3. (4p) Formulate the following queries in SQL: a) At position x=98, y=47 there is an apple tree. Which tree sort does this tree belong to? b) The gardener "Sven Svensson" is responsible for a number of trees. Which tree sorts do these trees belong to? c) How much does, on average, an apple of the sort "Åkerö" weigh? (Hint: There is an aggregate function called "avg".) d) Which gardener has measured the largest number of apples? (You can, if you like, simplify by defining a view.) 4. (1p) Formulate query 3b above (about which tree sorts that the gardener "Sven Svensson" is responsible for) in relational algebra. 5. (5p) The flight carrier Struggle And Sorrow needs a new system for booking seats in order to improve their passenger service. The seats on each flight are stored in a relation SEATBOOKINGS(FNO, DATE, SEAT, SMOKING, CLASS, PASSENGER) where FNO is the flight number, DATE is the day of the flight, SEAT is a seat number, SMOKING is 'y' for a seat with smoking allowed and 'n' otherwise, CLASS is 1 or 2, and PASSENGER is a space if the seat is unbooked, and the name of the passenger if the seat is booked. The key is FNO, DATE, SEAT. Implement, by using so called ECA triggers, a 'waiting list' where customers can register their seat requirements in case a suitable seat is already booked, and where the seat is booked as soon as a suitable seat is becoming available. Thereby it is also marked in the waiting list that the seat has become booked. The passenger should then be able to see in the waiting list which seat and flight s/he has received. The waiting list is a relation that you design yourself. Describe briefly how one registers on the waiting list. 6. (5p) What is replication? What is fragmentation? When is replication favourable? When is fragmentation favourable? How is the schema information stored in a distributed DBMS? 7. (4p) We have two relations: EMPLOYEE(SSN, NAME, SALARY, DNO, ADDRESS, PHONE) DEPARTMENT(DNO, DNAME, MGR, SECR, LOCATION) and the query: SELECT S.NAME FROM EMPLOYEE E, DEPARTMENT D, EMPLOYEE S WHERE E.NAME = "Kalle Karlsson" AND D.DNO = E.DNO AND D.SECR = S.SSN Translate the query to tuple relational calculus. Translate the query to domain relational calculus. 8. (4p) We have a large relation with many rows: SALESMEN(SSN, NAME, SALES, BONUS) One very often whishes to create a list of the 10 salesmen having the highest SALES figures (in order to increase their BONUS). - How should one design the database physically in order to efficiently support such top-10 queries? - What method do you use to retrieve the 10 best selling salesmen without retrieving all the rows in the relation? (in Standard SQL - not MS Access). - Explain briefly why the proposed method works.