DBS201- Lesson 6
0. Agenda
1- Recognizing the primary key (single attribute)
2- Recognizing the primary key (two attributes)
3- Define Functional Dependence
4- Examples of Functional Dependence upon a single attribute primary key.
5- Examples of Functional Dependence upon a concatenated primary key.
1. Recognizing the primary key (single attribute)
- The normalization process enables you to identify the existence of potential problems, called data anomalies (can't insert data or lose data).
- To correct anomalies in a database, you must convert tables into various types of normal forms.
- The most common normal forms are first normal form (1NF), second normal form (2NF), third normal form (3NF).
- Normalization is a progression in which a table that is in first normal form is better than a table that is in UNF, a table that is in second normal form is better than a table in first normal form, and so on.
In the following situation which column is the primary key? (Hint: the value in this column must be different in each row of data.)
AttribA AttribB AttribC AttribD AttribE AttribF
QQQO WWWO EEEO RRRO TTJO YVYO
QQOQ WWVO FEFO RRPO TJTO YYVO QOQQ WVWO FEEO RPRO JTTO VYYO
QOOQ VWWO EFEO PRRO JJTO VYVO
QOQO WWWO FEFO PPRO JTJO YYYO
QQQO VWVO EEFO PPRO TJJO VYVO
:
Answer: Only attribE has a different value in each row, so the primary key is AttribE
The values are: TTJO, TJTO, JTTO, JJTO, JTJO, and TJJO.
AttribA AttribB AttribC AttribD AttribE AttribF
QQQO WWWO EEEO RRRO TTJO YVYO
QQOQ WWVO FEFO RRPO TJTO YYVO QOQQ WVWO FEEO RPRO JTTO VYYO
QOOQ VWWO EFEO PRRO JJTO VYVO
QOQO WWWO FEFO PPRO JTJO YYYO
QQQO VWVO EEFO PPRO TJJO VYVO
We write the relation as:
(AttribE, AttribA, AttribB, AttribC, AttribD, AttribF)
Another example: A Person keeps track of Race Horses and Dogs
A B C D E .
Horse 2 years $500,000 Mohawk Race Track Lightening Rod
GreyHound 3 years $125,000 Tampa Florida Speckled Speed
Horse 1 year $ 80,000 Blue Bonnets Track Slow Harry
Greyhound 4 years $125,000 Hialia Atlantic City All Leggs
Answer:
Column A eliminated! Why?
Column B possible primary key! Why?
Column C eliminated! Why?
Column D possible primary key! Why?
Column E possible primary key! Why?
Columns that qualify as possible primary keys are called candidate keys.
The candidate keys are:
______
______
______
Of the above 3 candidate keys, which is the best selection as primary key?
______
2- Recognizing the primary key (two attributes)
We will change our example slightly so that our selection of primary key no longer works.
A B C D E .
Horse 2 years $500,000 Mohawk Race Track All Leggs
GreyHound 3 years $125,000 Tampa Florida Speckled Speed
Horse 1 year $ 80,000 Blue Bonnets Track Slow Harry
Greyhound 4 years $145,000 Hialia Atlantic City All Leggs
To identify each row with a unique primary key, the animal's name is not enough.
It would be possible for a GreyHound to be called “All Leggs”, resulting in a duplicate Primary Key value.
The best way to get a unique row identifier (PK) is to use two attributes concatenated together.
List the possible answers: ______+______
______+______
______+______
______+______
Which two attributes are best? ______
These are the best choice for primary key because each PK will not be re-used in another row.
2- Recognizing the primary key - two attributes (continued)
Example of Hockey Awards
Award PlayerName PNumber Position Team Year
Best Defense Joe Wall 17 Left Defense Toronto 1999
Sy Stopp 7 Right Defense Detroit 2000
Pete Puck 22 Left Defense Montreal 2001
Joe Wall 17 Left Wing Toronto 2002
MostValuable Sam Scores 18 Center Chicago 1999
Wayne Gret 99 Center New York 2000
Joe Wall 17 Left Wing Toronto 2002
Answer:
1. Step 1: Fill in all attributes:
Award PlayerName PNumber Position Team Year
Best Defense Joe Wall 17 Left Defense Toronto 1999
Best Defense Sy Stopp 7 Right Defense Detroit 2000
Best Defense Pete Puck 22 Left Defense Montreal 2001
Best Defense Joe Wall 17 Left Wing Toronto 2002
MostValuable Sam Scores 18 Center Chicago 1999
MostValuable Wayne Gret 99 Center New York 2000
MostValuable Joe Wall 17 Left Wing Toronto 2002
2. Step 2: Look for any column that has no value that is used in more than one row. We are looking for a column for which it’s value is UNIQUE. We first check to see if we can have a single attribute primary key.
No one column meets this criterion.
3. Step 3: Look for any pairs of columns which when concatenated produce a unique value.
Award PlayerName PNumber Position Team Year
Team + Year…………Position + Team……..
Position + Year………
PNumber + Position….
PNumber + Team…….
PNumber + Year……..
/ PlayerName + PNumber …
PlayerName + Position……
PlayerName + Team………
PlayerName + Year……….
Award + PlayerName……..
Award + PNumber………...
Award + Position………….
Award + Team…………….
Award + Year……………..
Award PlayerName PNumber Position Team Year
Best Defense Joe Wall 17 Left Defense Toronto 1999
Best Defense Sy Stopp 7 Right Defense Detroit 2000
Best Defense Pete Puck 22 Left Defense Montreal 2001
Best Defense Joe Wall 17 Left Wing Toronto 2002
MostValuable Sam Scores 18 Center Chicago 1999
MostValuable Wayne Gret 99 Center New York 2000
MostValuable Joe Wall 17 Left Wing Toronto 2002
So Award + Year will be selected as the PK.
and the relation will be:
(Award, Year, PlayerName, PNumber, Position, Team)
3. Define Functional Dependence.
Functional dependence is a formal name for what is basically a simple idea. A column (attribute), B, is functionally dependent on another column, A (or possibly a collection of columns), if a value for A determines a single value for B at any one time.
From our previous example
Type Age Winnings HomeRaceTrack Name .
Horse 2 years $500,000 Mohawk Race Track Lightening Rod
GreyHound 3 years $125,000 Tampa Florida Speckled Speed
Horse 1 year $ 80,000 Blue Bonnets Track Slow Harry
Greyhound 4 years $125,000 Hialia Atlantic City All Leggs
In this example we had selected only the Name as PK. This means that: if we know the Name, we can know the exact value of the other attributes.
- Type is functionally dependant on Name. Name functionally determines Type.
- Age is functionally dependant on Name. Name functionally determines Age.
- Winnings functionally dependant on Name. Name functionally determines Winnings.
etc…
We can draw a picture of functional dependency:
Name / Type / Age / Winnings / Home Race TrackFunctional Dependency with a two part PK
(Award, Year, PlayerName, PNumber, Position, Team)
Award / Year / PlayerName / PNumber / Position / Team