DBS201- Lesson 6

0. Agenda

1- Recognizing the primary key (single attribute)

2- Recognizing the primary key (two attributes)

3- Define Functional Dependence

4- Examples of Functional Dependence upon a single attribute primary key.

5- Examples of Functional Dependence upon a concatenated primary key.

1. Recognizing the primary key (single attribute)

- The normalization process enables you to identify the existence of potential problems, called data anomalies (can't insert data or lose data).

- To correct anomalies in a database, you must convert tables into various types of normal forms.

- The most common normal forms are first normal form (1NF), second normal form (2NF), third normal form (3NF).

- Normalization is a progression in which a table that is in first normal form is better than a table that is in UNF, a table that is in second normal form is better than a table in first normal form, and so on.

In the following situation which column is the primary key? (Hint: the value in this column must be different in each row of data.)

AttribA AttribB AttribC AttribD AttribE AttribF

QQQO WWWO EEEO RRRO TTJO YVYO

QQOQ WWVO FEFO RRPO TJTO YYVO QOQQ WVWO FEEO RPRO JTTO VYYO

QOOQ VWWO EFEO PRRO JJTO VYVO

QOQO WWWO FEFO PPRO JTJO YYYO

QQQO VWVO EEFO PPRO TJJO VYVO

:


Answer: Only attribE has a different value in each row, so the primary key is AttribE

The values are: TTJO, TJTO, JTTO, JJTO, JTJO, and TJJO.

AttribA AttribB AttribC AttribD AttribE AttribF

QQQO WWWO EEEO RRRO TTJO YVYO

QQOQ WWVO FEFO RRPO TJTO YYVO QOQQ WVWO FEEO RPRO JTTO VYYO

QOOQ VWWO EFEO PRRO JJTO VYVO

QOQO WWWO FEFO PPRO JTJO YYYO

QQQO VWVO EEFO PPRO TJJO VYVO

We write the relation as:

(AttribE, AttribA, AttribB, AttribC, AttribD, AttribF)

Another example: A Person keeps track of Race Horses and Dogs

A B C D E .

Horse 2 years $500,000 Mohawk Race Track Lightening Rod

GreyHound 3 years $125,000 Tampa Florida Speckled Speed

Horse 1 year $ 80,000 Blue Bonnets Track Slow Harry

Greyhound 4 years $125,000 Hialia Atlantic City All Leggs

Answer:

Column A eliminated! Why?

Column B possible primary key! Why?

Column C eliminated! Why?

Column D possible primary key! Why?

Column E possible primary key! Why?

Columns that qualify as possible primary keys are called candidate keys.

The candidate keys are:

______

______

______

Of the above 3 candidate keys, which is the best selection as primary key?

______

2- Recognizing the primary key (two attributes)

We will change our example slightly so that our selection of primary key no longer works.

A B C D E .

Horse 2 years $500,000 Mohawk Race Track All Leggs

GreyHound 3 years $125,000 Tampa Florida Speckled Speed

Horse 1 year $ 80,000 Blue Bonnets Track Slow Harry

Greyhound 4 years $145,000 Hialia Atlantic City All Leggs

To identify each row with a unique primary key, the animal's name is not enough.

It would be possible for a GreyHound to be called “All Leggs”, resulting in a duplicate Primary Key value.

The best way to get a unique row identifier (PK) is to use two attributes concatenated together.

List the possible answers: ______+______

______+______

______+______

______+______

Which two attributes are best? ______

These are the best choice for primary key because each PK will not be re-used in another row.


2- Recognizing the primary key - two attributes (continued)

Example of Hockey Awards

Award PlayerName PNumber Position Team Year

Best Defense Joe Wall 17 Left Defense Toronto 1999

Sy Stopp 7 Right Defense Detroit 2000

Pete Puck 22 Left Defense Montreal 2001

Joe Wall 17 Left Wing Toronto 2002

MostValuable Sam Scores 18 Center Chicago 1999

Wayne Gret 99 Center New York 2000

Joe Wall 17 Left Wing Toronto 2002


Answer:

1. Step 1: Fill in all attributes:

Award PlayerName PNumber Position Team Year

Best Defense Joe Wall 17 Left Defense Toronto 1999

Best Defense Sy Stopp 7 Right Defense Detroit 2000

Best Defense Pete Puck 22 Left Defense Montreal 2001

Best Defense Joe Wall 17 Left Wing Toronto 2002

MostValuable Sam Scores 18 Center Chicago 1999

MostValuable Wayne Gret 99 Center New York 2000

MostValuable Joe Wall 17 Left Wing Toronto 2002

2. Step 2: Look for any column that has no value that is used in more than one row. We are looking for a column for which it’s value is UNIQUE. We first check to see if we can have a single attribute primary key.

No one column meets this criterion.

3. Step 3: Look for any pairs of columns which when concatenated produce a unique value.

Award PlayerName PNumber Position Team Year

Team + Year…………
Position + Team……..
Position + Year………
PNumber + Position….
PNumber + Team…….
PNumber + Year……..
/ PlayerName + PNumber …
PlayerName + Position……
PlayerName + Team………
PlayerName + Year……….
Award + PlayerName……..
Award + PNumber………...
Award + Position………….
Award + Team…………….
Award + Year……………..

Award PlayerName PNumber Position Team Year

Best Defense Joe Wall 17 Left Defense Toronto 1999

Best Defense Sy Stopp 7 Right Defense Detroit 2000

Best Defense Pete Puck 22 Left Defense Montreal 2001

Best Defense Joe Wall 17 Left Wing Toronto 2002

MostValuable Sam Scores 18 Center Chicago 1999

MostValuable Wayne Gret 99 Center New York 2000

MostValuable Joe Wall 17 Left Wing Toronto 2002

So Award + Year will be selected as the PK.

and the relation will be:

(Award, Year, PlayerName, PNumber, Position, Team)

3. Define Functional Dependence.

Functional dependence is a formal name for what is basically a simple idea. A column (attribute), B, is functionally dependent on another column, A (or possibly a collection of columns), if a value for A determines a single value for B at any one time.

From our previous example

Type Age Winnings HomeRaceTrack Name .

Horse 2 years $500,000 Mohawk Race Track Lightening Rod

GreyHound 3 years $125,000 Tampa Florida Speckled Speed

Horse 1 year $ 80,000 Blue Bonnets Track Slow Harry

Greyhound 4 years $125,000 Hialia Atlantic City All Leggs

In this example we had selected only the Name as PK. This means that: if we know the Name, we can know the exact value of the other attributes.

- Type is functionally dependant on Name. Name functionally determines Type.

- Age is functionally dependant on Name. Name functionally determines Age.

- Winnings functionally dependant on Name. Name functionally determines Winnings.

etc…

We can draw a picture of functional dependency:

Name / Type / Age / Winnings / Home Race Track

Functional Dependency with a two part PK

(Award, Year, PlayerName, PNumber, Position, Team)

Award / Year / Player
Name / PNumber / Position / Team