23
Database [03. The Relational Model and Normalization] 2009년 2학기 [email protected] 1 Chapter 3. The Relational Model and Normalization 1. Introduction 2 2. Relational Model Terminology 3 4. Normal Forms 11 5. Multi-valued Dependency 21 6. The Fifth Normal Form 22

03. Relational Model and Normalizationdblab.sangji.ac.kr/downloads/2014hanb_DB/chap3.pdf · • Foreign Key • Referential ... ■ Boyce-Codd Normal Form (BCNF) - A relation is in

Embed Size (px)

Citation preview

Page 1: 03. Relational Model and Normalizationdblab.sangji.ac.kr/downloads/2014hanb_DB/chap3.pdf · • Foreign Key • Referential ... ■ Boyce-Codd Normal Form (BCNF) - A relation is in

Database [03. The Relational Model and Normalization] 2009년 2학기

[email protected] 1

Chapter 3. The Relational Model and Normalization

1. Introduction 2

2. Relational Model Terminology 3

4. Normal Forms 11

5. Multi-valued Dependency 21

6. The Fifth Normal Form 22

Page 2: 03. Relational Model and Normalizationdblab.sangji.ac.kr/downloads/2014hanb_DB/chap3.pdf · • Foreign Key • Referential ... ■ Boyce-Codd Normal Form (BCNF) - A relation is in

Database [03. The Relational Model and Normalization] 2009년 2학기

[email protected] 2

Chapter 3. The Relational Model and Normalization

1. Introduction

■ Questions ?

- Should we store these two tables as they are, or

- Should we combine them into one table in our new database?

■ We need to understand:

- The relational model

- Relational model terminology

Page 3: 03. Relational Model and Normalizationdblab.sangji.ac.kr/downloads/2014hanb_DB/chap3.pdf · • Foreign Key • Referential ... ■ Boyce-Codd Normal Form (BCNF) - A relation is in

Database [03. The Relational Model and Normalization] 2009년 2학기

[email protected] 3

2. Relational Model Terminology

■ Relational Model

- Introduced in 1970

Created by E.F. Codd

- He was an IBM engineer

- The model used mathematics known as "relational algebra"

Now the standard model for commercial DBMS products

■ The most Important Terms used by the Relational Model

• Entity• Relation• Functional Dependency• Determinant• Candidate Key• Composite Key• Primary Key• Surrogate Key• Foreign Key• Referential integrity constraint• Normal Form• Multivalued Dependency

(1) Entity

An entity is some identifiable thing that users want to track:

- Customers

- Computers

- Sales

Page 4: 03. Relational Model and Normalizationdblab.sangji.ac.kr/downloads/2014hanb_DB/chap3.pdf · • Foreign Key • Referential ... ■ Boyce-Codd Normal Form (BCNF) - A relation is in

Database [03. The Relational Model and Normalization] 2009년 2학기

[email protected] 4

(2) Relation

A Relation :

- Figure 3-5 : Tables that are not relations

(a) Table with Multiple entries per cell

(b) Table with Required Row order

- Alternative Terminology

Page 5: 03. Relational Model and Normalizationdblab.sangji.ac.kr/downloads/2014hanb_DB/chap3.pdf · • Foreign Key • Referential ... ■ Boyce-Codd Normal Form (BCNF) - A relation is in

Database [03. The Relational Model and Normalization] 2009년 2학기

[email protected] 5

(3) Functional Dependency

■ A functional dependency occurs when the value of one (a set of) attribute(s)

determines the value of a second (set of) attribute(s):

StudentID → StudentName

StudentID → (DormName, DormRoom, Fee)

■ X → Y :

If the value of X determines the value of Y ,

attribute Y is functionally dependent on attribute X

■ Determinant

- The attribute on the left side of the functional dependency is called the

determinant

- Functional dependencies may be based on equations:

ExtendedPrice = Quantity X UnitPrice

(Quantity, UnitPrice) → ExtendedPrice

- Function dependencies are not equations!

■ Composite Determinant

- A determinant of a functional dependency that consists of more than one

attribute

(StudentName, ClassName) → (Grade)

■ Functional Dependency Rule

① If X → (Y, Z), then X → Y and X → Z

② But, If (X, Y) → Z, it is not true X → Y and Y → Z

neither X nor Y determines Z by itself

Page 6: 03. Relational Model and Normalizationdblab.sangji.ac.kr/downloads/2014hanb_DB/chap3.pdf · • Foreign Key • Referential ... ■ Boyce-Codd Normal Form (BCNF) - A relation is in

Database [03. The Relational Model and Normalization] 2009년 2학기

[email protected] 6

■ Formal Description

Let R be a relation schema α ⊆R, β⊆R

The functional dependency α → β holds on R

if and only if

for any legal relation r(R),

whenever any two tuples t1 and t2 of r agree on the attributes α,

they also agree on the attributes on the β

That is, t1[α] = t2[α] ⇒ t1[β] = t2[β]

Example : Supplier Relation

S# → (SNAME, STATUS, CITY)

S# SNAME STATUS CITYS1 Smith 20 LondonS2 Jones 10 ParisS3 Blake 30 ParisS4 Clark 20 LondonS5 Adams 30 Athens

FD diagram

S#

SNAME

STATUS

CITY

S#

SNAME

STATUS

CITY

■ Finding Functional Dependency in SKU_DATA Relation

SKU → (SKU_Description, Department, Buyer)

SKU_Description → (SKU, Department, Buyer)

Buyer → Department

Page 7: 03. Relational Model and Normalizationdblab.sangji.ac.kr/downloads/2014hanb_DB/chap3.pdf · • Foreign Key • Referential ... ■ Boyce-Codd Normal Form (BCNF) - A relation is in

Database [03. The Relational Model and Normalization] 2009년 2학기

[email protected] 7

■ Finding Functional Dependency in Order_ITEM Relation

(OrderNumber, SKU) → (Quantity, Price, ExtendedPrice)

(Quantity, Price) → (ExtendedPrice)

SKU ↛ PRICE

(the price can be changed after an order has been processed)

■ What Are Determinant Values Unique ?

■ A determinant is unique in a relation if, and only if, it determines every

other column in the relation

■ You cannot find the determinants of all functional dependencies simply by

looking for unique values in one column:

- Data set limitations

- Must be logically a determinant

※ To determine if attribute A determines attribute B

"Every time that a value of attribute A appears is it mateched with the

same value of attribute B?"

- sample data can be incomplete

Page 8: 03. Relational Model and Normalizationdblab.sangji.ac.kr/downloads/2014hanb_DB/chap3.pdf · • Foreign Key • Referential ... ■ Boyce-Codd Normal Form (BCNF) - A relation is in

Database [03. The Relational Model and Normalization] 2009년 2학기

[email protected] 8

■ Keys

■ Key

A group of one or more attributes that uniquely identifies a row

■ K is a super key for relation schema R if and only if

K →R

(uniqueness property)

■ K is candidate key for R if and only if

K → R and

there is no α ⊂ K, α → R

( minimality property)

■ A primary key is a candidate key selected as the primary means of

identifying rows in a relation:

- There is one and only one primary key per relation

- The primary key may be a composite key

■ There is at least one candidate key

- because relations do not contain duplicate tuples

- at least the combination of all attributes of the relation has the

uniqueness property

■ A surrogate key as an artificial column added to a relation to serve as a

primary key:

■ A foreign key is the primary key of one relation that is placed in another

relation to form a link between the relations:

- A foreign key can be a single column or a composite key

- The term refers to the fact that key values are foreign to the relation

in which they appear as foreign key values

DEPARTMENT (DepartmentName, BudgetCode, ManagerName)EMPLOYEE(EmployeeNumber,EmployeeName, DepartmentName)

Page 9: 03. Relational Model and Normalizationdblab.sangji.ac.kr/downloads/2014hanb_DB/chap3.pdf · • Foreign Key • Referential ... ■ Boyce-Codd Normal Form (BCNF) - A relation is in

Database [03. The Relational Model and Normalization] 2009년 2학기

[email protected] 9

■ Definition of Foreign Key

■ Let R2 be a base relation.

■ Then foreign key in R2 is a subset of the set of attributes of R2, say FK,

such that

① There exists a base relation R1 with a candidate key CK, (R1 and R2 not

necessarily distinct)

② For all time, each value of FK in the current value of R2 is identical to

the value of CK in some tuple in the current value of R1

CK R2R1 FK

■ Referential Integrity Rule

The database must not contain any unmatched foreign key values

If B references A, A must exist

"foreign key" and "referential integrity" are defined in terms of each other

■ Database Designer

- specify which operations should be rejected

- specify which compensating operations should be performed

What should happen on an attempt to delete the target of a foreign key

reference ?

RESTRICTED : "restricted" to the case where there are no matching.

CASCADES : " cascades" to delete those matching also

Page 10: 03. Relational Model and Normalizationdblab.sangji.ac.kr/downloads/2014hanb_DB/chap3.pdf · • Foreign Key • Referential ... ■ Boyce-Codd Normal Form (BCNF) - A relation is in

Database [03. The Relational Model and Normalization] 2009년 2학기

[email protected] 10

■ Nulls

- as a basis for dealing with the problem of missing information

- "date of birth unknown","to be announced"

- Nulls are not the same as blank or zero

three valued logic

AND True False unknownTrue True False unknownFalse

unknownFalse False

unknownFalse

Unknown False

OR True False unknownTrue True True TrueFalse

unknownTrue False

unknownunknown

True unknown

Entity Integrity Rule

■ No component of the primary key of a base relation is allowed to accept

nulls.

■ definition of every attribute involved in primary key must include the

specification NULLS NOT ALLOWED

Integrity Rules - Summary

Referential Integrity Rule

Entity Integrity Rule

Page 11: 03. Relational Model and Normalizationdblab.sangji.ac.kr/downloads/2014hanb_DB/chap3.pdf · • Foreign Key • Referential ... ■ Boyce-Codd Normal Form (BCNF) - A relation is in

Database [03. The Relational Model and Normalization] 2009년 2학기

[email protected] 11

4. Normal Forms

■ Modification Anomalies

■ update anomaly

(before update)

(after update)

■ Deletion Anomaly

- To lose facts that tow different things, ( a machine and a repair)

- If you delete a tuple that has the item number 200 or 300

■ Insertion Anomaly

※ Primary Key :( OrderNumber, SKU)

- We assume that situation when we are going to insert the fact that related

with only the SKU ( SKU, SKU_Descripion, Department, Buyer)

- Is it possible ?

Page 12: 03. Relational Model and Normalizationdblab.sangji.ac.kr/downloads/2014hanb_DB/chap3.pdf · • Foreign Key • Referential ... ■ Boyce-Codd Normal Form (BCNF) - A relation is in

Database [03. The Relational Model and Normalization] 2009년 2학기

[email protected] 12

■ Normalization Theory

All Possible Relational Schema

♦ Repetition of information♦ Inability to represent certain information♦ Inefficient retrieval and update performance

Good Database DesignGood Database Design

Normalization

■ Relations are categorized as a normal form based on which modification

anomalies or other problems that they are subject to:

Page 13: 03. Relational Model and Normalizationdblab.sangji.ac.kr/downloads/2014hanb_DB/chap3.pdf · • Foreign Key • Referential ... ■ Boyce-Codd Normal Form (BCNF) - A relation is in

Database [03. The Relational Model and Normalization] 2009년 2학기

[email protected] 13

■ Normalization Category

■ 1NF - A table that qualifies as a relation is in 1NF

■ 2NF - A relation is in 2NF if all of its nonkey attributes are dependent on

all of the primary key

■ 3NF

- A relation is in 3NF if it is in 2NF and has no determinants except the

primary key

- The relation R is 3NF iff it is 2NF and has no transitive dependencies

■ Boyce-Codd Normal Form (BCNF) - A relation is in BCNF if every

determinant is a candidate key

■ Eliminating Anomalies from Functional Dependencies

Put all relations into Boyce-Codd Normal Form (BCNF):

Page 14: 03. Relational Model and Normalizationdblab.sangji.ac.kr/downloads/2014hanb_DB/chap3.pdf · • Foreign Key • Referential ... ■ Boyce-Codd Normal Form (BCNF) - A relation is in

Database [03. The Relational Model and Normalization] 2009년 2학기

[email protected] 14

■ Eliminating Anomalies from Functional Dependencies : Example 1 (page 84)

- SKU_DATA relation

■ FDs

SKU → (SKU_Description, Department, Buyer)SKU_Description → (SKU, Department, Buyer)Buyer → Department

■ SKU and SKU_Description

; all of the attributes in the table, so they are candidate keys

■ Buyer

; It does not determine all of the other attributes

★ Hence, the SKU_DATA relation is not in BCNF

Step 3-A in Figure 3-11 :

Step 3-B : make the Buyer attribute the primary key

BUYER(Buyer, Department)

Step 3-C and 3-D : SKU_DATA_2(SKU, SKU_Description, Buyer)

Page 15: 03. Relational Model and Normalizationdblab.sangji.ac.kr/downloads/2014hanb_DB/chap3.pdf · • Foreign Key • Referential ... ■ Boyce-Codd Normal Form (BCNF) - A relation is in

Database [03. The Relational Model and Normalization] 2009년 2학기

[email protected] 15

[Report #3]

■ Describe the whole process of below examples and write your explanation

and PKs, FKs of the decomposed relations for each step of process for

putting a relation into BCNF.

1) Eliminating Anomalies from Functional Dependencies : Example 2 (page 85)

2) Eliminating Anomalies from Functional Dependencies : Example 3 (page 86)

3) Eliminating Anomalies from Functional Dependencies : Example 4 (page 87)

4) Eliminating Anomalies from Functional Dependencies : Example 5 (page 89)

Page 16: 03. Relational Model and Normalizationdblab.sangji.ac.kr/downloads/2014hanb_DB/chap3.pdf · • Foreign Key • Referential ... ■ Boyce-Codd Normal Form (BCNF) - A relation is in

Database [03. The Relational Model and Normalization] 2009년 2학기

[email protected] 16

■ Theoretical Approach : The First Normal Form

■ definition

- Any table of data that meets the definition of a relation

■ Formal Definition

A relation is in 1NF iff all underlying domains contain scalar values only

Example

"상태 20인 도시 “런던”에 위치한 공급자 S1이 부품 P1을 300개 공급한다“

20

S# STATUS CITYS1 20 LondonS1 London

P#P1P2

QTY300200

S1S1S2S2S3

2020101010

ParisParisParis

LondonLondon

P3P4P1P2P2

400200300400200

S#

P#

QTY

CITY

STATUS

S5 40 Athens P4 100

Anomalies in 1NF

③ Insertion Anomaly

New Supplier without P#

④ Deletion Anomaly

Unexpected loss of bundled information

⑤ Update Anomaly

London → Paris

20

S# STATUS CITYS1 20 LondonS1 London

FIRST

P#P1P2

QTY300200

S1S1S2S2

20201010

ParisParis

LondonLondon

P3P4P1P2

400200300400

S5 40 Athens P4 100S3 90 New York P2 200

redundancy

S6 70 Florence Null Null

Page 17: 03. Relational Model and Normalizationdblab.sangji.ac.kr/downloads/2014hanb_DB/chap3.pdf · • Foreign Key • Referential ... ■ Boyce-Codd Normal Form (BCNF) - A relation is in

Database [03. The Relational Model and Normalization] 2009년 2학기

[email protected] 17

Solution

■ Decomposition

First ⇒ SECOND(S#, STATUS, CITY), SP(S#, P#, QTY)

Nonloss(lossless decomposition)

- Reversibility

S# STATUS CITYS3 30 ParisS5 30 Athens

S# STATUSS3 30S5 30

STATUS CITY30 Paris30 Athens

S

STC

S# STATUSS3 30S5 30

S# CITYS3 ParisS5 Athens

SCnonloss

lossy

test for the lossless decomposition

S# STATUSS3 30S5 30

S# CITYS3 ParisS5 Athens

(a) SST SC S# STATUS CITYS3 30 ParisS5 30 Athens

S

S# STATUSS3 30S5 30

STATUS CITY30 Paris30 Athens

(b) SST STC S# STATUS CITYS3 30 Paris

S5 30 Athens

S

S3 30 AthensS5 30 Paris

nonloss

lossy

Page 18: 03. Relational Model and Normalizationdblab.sangji.ac.kr/downloads/2014hanb_DB/chap3.pdf · • Foreign Key • Referential ... ■ Boyce-Codd Normal Form (BCNF) - A relation is in

Database [03. The Relational Model and Normalization] 2009년 2학기

[email protected] 18

Decomposition : In a view of functional dependency

S# STATUS CITYS3 30 ParisS5 30 Athens

S# STATUSS3 30S5 30

STATUS CITY30 Paris30 Athens

S

(b) SST STC

S# STATUSS3 30S5 30

S# CITYS3 ParisS5 Athens

(a) SST SC

S# → STATUSS# → CITY

S# → STATUSS# → CITY

S# → STATUSS# → CITY

S# → STATUSS# → CITY

S# → STATUSS# → STATUS

S# → CITYis lostWe cannot tell which

supplier has which city

Heath's Theorem

Let R(A, B, C) be a Relation, where A, B, C are set of attributes,

- If R satisfies the FD A→ B,

- then R is equal to join of its projections on R(A, B) and R(B, C)

First ⇒ SECOND(S#, STATUS, CITY), SP(S#, P#, QTY)

S#

STATUS CITY

S1

20 London

S1SECOND

P#P1P2

QTY300200

S1S1S2S2S3

1010

ParisParis

P3P4P1P2P2

400200300400200

S#S1S2S3

SP

40 AthensS5

S5 P4 100

Page 19: 03. Relational Model and Normalizationdblab.sangji.ac.kr/downloads/2014hanb_DB/chap3.pdf · • Foreign Key • Referential ... ■ Boyce-Codd Normal Form (BCNF) - A relation is in

Database [03. The Relational Model and Normalization] 2009년 2학기

[email protected] 19

■ Theoretical Approach : The Second Normal Form

Definition

- R is in 2NF When all of a relation's nonkey attributes are dependent of a

key

Formal Definition

A relation R is in 2NF if and only if

- 1NF

- Every non-key attribute is irreducibly(fully) dependent on the primary

key

■ Anomaly in 2NF

Insertion Anomaly

- Insertion {Florence, Italy}

Deletion Anomaly

- Deletion of tuple S1 : Unexpected loss of bundled information

Update Anomaly

STATUS CITY20 London

SECOND

1010

ParisParis

S#S1S2S3

40 AthensS5

S# CITY

STATUS

Because, Transitive Dependency

■ Decomposition

SECOND(S#, STATUS, CITY) ⇒decomposition

SC(S#, STATUS), CS(CITY, STATUS)

S# STATUSCITYS1 London

SC

S2S3

3020Paris

ParisS5 Athens

CITY

LondonParisRome

Athens

1050

CS

Page 20: 03. Relational Model and Normalizationdblab.sangji.ac.kr/downloads/2014hanb_DB/chap3.pdf · • Foreign Key • Referential ... ■ Boyce-Codd Normal Form (BCNF) - A relation is in

Database [03. The Relational Model and Normalization] 2009년 2학기

[email protected] 20

■ Theoretical Approach : The Third Normal Form

■ Definition

The relation R is 3NF iff it is 2NF and has no transitive dependencies

■ Formal Definition

- 2NF

- Every non-key attribute is non-transitively(no mutual dependency) on the

primary key

■ Anomaly in 3NF

S

Smith

Smith

Jones

Jones

J

Math

Math

Physics

Physics

T

Prof. White

Prof. White

Prof. Green

Prof. Brown

S

JT

3NF

Two candidate keys :{S,J} , {S, T}

Semantic constraintsFor each subject, each student of that subject is taught by onlyone teacher { S, J } → TEach teacher teaches only one subject T → JEach subject is taught by several teachers

AnomalyJones drops the “physics”delete the last tuple we loss the information “Prof. Brown teaches Physic”

■ Solution : Decomposition

- SJT ⇒ ST(S, T), TJ((T, J)

- BCNF(Boyce-Codd Normal Form)

Page 21: 03. Relational Model and Normalizationdblab.sangji.ac.kr/downloads/2014hanb_DB/chap3.pdf · • Foreign Key • Referential ... ■ Boyce-Codd Normal Form (BCNF) - A relation is in

Database [03. The Relational Model and Normalization] 2009년 2학기

[email protected] 21

5. Multi-valued Dependency

■ A multi-valued dependency occurs when a determinant determines a

particular set of values:

Employee ↠ Degree

Employee ↠ Sibling

PartKit ↠ Part

■ The determinant of a multivalued dependency can never be a primary key

■ Multivalued dependencies are not a problem if they are in a separate relation,

so:

- Always put multivaled dependencies into their own relation

- This is known as Fourth Normal Form (4NF)

Page 22: 03. Relational Model and Normalizationdblab.sangji.ac.kr/downloads/2014hanb_DB/chap3.pdf · • Foreign Key • Referential ... ■ Boyce-Codd Normal Form (BCNF) - A relation is in

Database [03. The Relational Model and Normalization] 2009년 2학기

[email protected] 22

A relation is in 4NF iff, for all multi-valued dependency A →→B, non-key attributes are dependent on A

C

database

database

B

Korth

Ullman

T

Prof. White

Prof. White

database

database

Korth

Ullman

Prof. Green

Prof. Green

databaseProf. White

Prof. Green

Korth

Ullman

database UllmanProf. Sara

database KorthProf. Sara

insertion

C →→ T C →→ B

6. The Fifth Normal Form

A relation is in 5NF iff , for all join dependency (R1, R2, ..., Rn),

R1, R2 , … , Rn is a candidate keyS#S1S1

P#P1P2

J#J2J1

S#S1S1

P#P1P2

P#P1P2

J#J2J1

S#S1S1

J#J2J1

S#S1S1

P#P1P2

P#P1P2

J#J2J1

S#S1S1

J#J2J1

S2 P1 P1 J1 S2 J1

S2 P1 J1

S1 P1 J1

S1 P1 J1

Page 23: 03. Relational Model and Normalizationdblab.sangji.ac.kr/downloads/2014hanb_DB/chap3.pdf · • Foreign Key • Referential ... ■ Boyce-Codd Normal Form (BCNF) - A relation is in

Database [03. The Relational Model and Normalization] 2009년 2학기

[email protected] 23

[Report #4]

1. make the relation that is the result of join on the relation CUSTOMER and

ORDER.

CUSTOMER (Cid, Name, Address)

Cid → (Name, Address, zipcode)Address → Zipcode

ORDER (BookNumber, Cid, Price, Orderdate)

(BookNumber, Cid) → (Price, Orderdate)

2. Write a detail process the following problems.

(1) Describe the FDs in result relation 1 and illustrate the anomaly

(2) Transform into the second Normal Form

(3) Transform into the third Normal Form

(4) Transform into the BCNF

(5) make sure that result of (4) is the same as the result of problem 1