Database Management Systems

Fall 2005 CS 370 / IT 376 Exam 2 Page 5

Database Management Systems

11/7/05

Name______

1. True/False.

[15 pts]

______Statement level interface for embedded SQL requires a compiler preprocessor and extensions to the syntax of the language.

______A call level interface for embedded SQL does not permit reference to host language (e.g. Java) variables.

______Embedded SQL in Java we used through the java.sql.* package is an example of a call level interface.

______The SQLSTATE system variable is a 5 digit string that contains standard return codes.

______A Tetabyte (a trillion bytes) is considered a medium size for a database.

______Disk accesses are roughly a thousand times slower than RAM access.

______ODBC is a database access protocol that is for object oriented languages only.

______A cylinder on a disk is composed of all the tracks of a platter.

______A heap or pile file maintains the data in order by its primary key.

______A table’s tuples stored with a B+-tree index will be clustered by the primary key.

______An object oriented database schema may define public and private methods in addition to attributes and relationships.

______Person ISA sub-entities such as (staff, student, faculty) are examples of mutually exclusive specialization.

______Extendible hashing eliminates the need to rehash a file to accommodate a larger, growing data set.

______CASCADE DELETE is appropriate on a foreign key when referencing from a weak entity.

______INSERTing into a view based on a join is acceptable.

2. Give two interpretations of the “null” value in a relation. What does normalization say about null values?

[5 pts]

3. Explain what a logical transaction is. How is it started? How is it terminated?

[4 pts]

4. What is a lossless join decomposition?

[4 pts]

For the next two questions, use the following bookstore database schema. Keys are underlined. ISBN is the unique book ID number. The rest of the attributes should be self-evident. If not, please ask for clarification.

CUSTOMER (CID, Name, Street, City, State, Zip, CreditLimit)

BOOK(ISBN, Title, Author, CopiesOnHand, Price, Publisher, PublisherFax)

PURCHASED(ISBN,CID, DatePurchase, Copies)

BACKORDER(ISBN, CID, Copies, HomePhone)

5. There are at least two relations above that have functional dependencies on either a non-key or sub-key. Identify the offending functional dependencies by drawing arrows above, and normalize the offending relations so that the resulting relations are in 3NF. Show the new relations below.

[9 pts]

6. Explain what the following complementing SQL triggers should accomplish. Any syntax errors are not intended and there are some extensions to the grammar.

[6 pts]

CREATE TRIGGER COH1

AFTER INSERT ON PURCHASED

REFERENCING NEW AS P

FOR EACH ROW

WHEN (P.Copies>=(SELECT B.CopiesOnHand FROM BOOK

WHERE P.ISBN = BOOK.ISBN))

UPDATE BOOK

SET CopiesOnHand = CopiesOnHand – P.Copies

WHERE P.ISBN = BOOK.ISBN;

CREATE TRIGGER COH2

AFTER INSERT ON PURCHASED

REFERENCING NEW AS P

FOR EACH ROW

WHEN (P.Copies<(SELECT B.CopiesOnHand FROM BOOK

WHERE P.ISBN = BOOK.ISBN))

UPDATE BOOK

SET CopiesOnHand = 0

WHERE P.ISBN = BOOK.ISBN;

INSERT INTO BACKORDER VALUES(P.ISBN,P.CID,

(P.Copies – (SELECT B.CopiesOnHand FROM BOOK

WHERE P.ISBN = BOOK.ISBN))

);

7. The following objects and are crucial components to successfully accessing a database from Java or in any embedded SQL setting. Give a brief explanation in general of what each of the objects contain or accomplish.

[12 pts]

a. Driver:

b. Connection:

c. PreparedStatement:

d. ResultSet:

8. Disk drives.

a. If a disk has sectors that are 1024 bytes, describe how the data in a large relational table would likely be broken up.

[4 pts]

b. Of the three components of a disk’s access time (seek, latency, transfer), which takes the longest? Give an estimate for the typical access time.

[3 pts]

c. To process a sector from part a, how much memory would have to be allocated for a double buffer so that the next sector can be read while the current one is being processed?

[2 pts]

9. When is the benefit of using hashing to store and retrieve a tuple in a file over a B+-tree index?

[8 pts]

Conversely, when would a B+-tree index be desirable to have over hashing?

10. Construct the index nodes for the B+-tree below. The number of values that will fit in one node is 3. For degree = 4, you normally have index nodes with at least two keys in it. For level 1, assume that you have two nodes, the first nodes indexes the left three data nodes and the second node indexes the right three. Then the root node indexes the two nodes at level 1 (hint: be sure you select your key value from the data node)

[8 pts]

Root level

Level 1:

Data Nodes: (2-3-5) (9-13) (17-19-23) (28-35-38) (40-43-45) (50-54)

Now show how the B+-tree appears after inserting the value 44.

For the SQL queries, use the following relational schema for a professional baseball team database. Keys are underlined. Example, albeit inaccurate, data are given. The attributes should be self-evident. If not, please ask for clarification. It is simplified from the previous exam.

Team(TName, League, Area, Owner, City, Stadium)
(‘Yankees’,’American’,’East’,’Steinbrenner’ , ‘New York’, ‘Yankee’)

PERSON(ID, Name, Address, Salary, DOB)
(2833,’Randy Johnson’, NULL, 2000000, ’1963-09-10’)
(399, ‘Joe Torre’, NULL, 1500000, ‘1941-03-02’)

MEMBER(ID, TName, StartDate, ReleaseDate, Position, MemberType, Number)
(2833,’Yankees’,’1995-01-01’, NULL, ‘Pitcher’,’Player’, 41)

(399, ‘Yankees’, ‘1994-01-01’, NULL, ‘Manager’, ‘Staff’,NULL)

GAME(Date, HomeTeam, VisitingTeam, HomeScore, VisitorScore, InningsPlayed)
(‘2005-10-01’,’Red Sox’, ‘Yankees’, 4,8,9)

Syntax for SQL, where [] means optional, {op1|op2|...} means choice

SELECT [DISTINCT] {* | attribute-list | aggregate functions}...

FROM tables-list and aliases

WHERE condition

[GROUP BY attr [HAVING condition]]

SQL conditions consist of <,>,<=,>=, >,=, AND, OR, BETWEEN value AND value
[NOT] EXISTS ({list | SELECT...}),
rel-op {ANY|SOME|ALL} ({ list | SELECT...}), IS [NOT] NULL

Aggregate functions: COUNT(*), MIN(attr), MAX(attr), SUM(attr), AVG(attr)

11. Give the SQL select statements for the following queries.

[20 pts]

a) List all players and team names who currently earn more than a million dollars.

SELECT P.Name,, P.Salary, M.TName

FROM PERSON P, MEMBER M

WHERE P.Salary>1000000 AND M.ReleaseDate=Null

AND P.ID = M.ID AND M.MemberType=’Player’

b) List all the pitchers’ names who currently play for teams in New York City. Assume you do not know which teams they are to construct the query.

SELECT P.Name, M.TName

FROM PERSON P, MEMBER M, TEAM T

WHERE T.City = ‘New York City’ AND T.TName=M.TName

AND M.Position=’Pitcher’ AND M.ID=P.ID AND M.ReleaseDate=Null

c) List current staff members who were never players. (Hint use a subselect)

SELECT P.Name

FROM PERSON P, MEMBER M

WHERE M.MemberType=’Staff’ AND M.ReleaseDate=Null AND P.ID=M.ID

AND M.ID NOT IN ( SELECT N.ID

FROM MEMBER P

WHERE M.MemberType=’Player’)

d) Display the average salaries of current players for each team.

SELECT AVG(P.Salary), M.TName

FROM PERSON P, MEMBER M

WHERE M.ReleaseDate=Null

AND M.ID=P.ID

GROUP BY M.TName

e) Who has/had the largest salary and for which team? (Hint use a subselect)

SELECT P.Name, M.TName, P.Salary

FROM MEMBER M, PERSON P

WHERE P.ID = M.ID

AND P.Salary >= (SELECT Salary FROM PERSON)