Yunsheng Liu, Software College, HUST, 2007.11

V. Data Storage Management


5.1 The Memory Hierarchy

5.1.1 The Storage Levels

[Figure: the memory hierarchy. The CPU issues a request for data, and data satisfying the request is returned.
Levels: Cache, Main Memory, Flash Memory, EEPROM (electrically erasable), Magnetic Disk, Optical Disk, CD-ROM, Magnetic Tape.
Groupings: Primary storage, Secondary storage, Tertiary storage.]

5.1.2 Disk Property

[Figure: disk structure, labeling the platter, spindle, disk arm, disk head, tracks, sectors, gaps, blocks, and cylinder.]


2. Performance properties of disks

1). Data must be in main memory for the DBMS to operate on it.

2). The unit of data transfer between main memory and disk is a block. Reading or writing a disk block is called an I/O.

3). Block access time is the time from when a read/write is issued to when the block appears in main memory:

access time = seek time + rotational delay + transfer time
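The access-time formula can be checked with a short calculation. The sketch below is only illustrative: the function name, the parameters, and the sample drive figures are assumptions, not values from the slides, and the rotational delay is taken as the average case of half a revolution.

```python
def block_access_time_ms(seek_ms, rpm, block_bytes, transfer_mb_per_s):
    """access time = seek time + rotational delay + transfer time (all in ms)."""
    # Average rotational delay: half of one full revolution.
    rotational_delay_ms = 0.5 * 60_000 / rpm
    # Time to move one block at the sustained transfer rate.
    transfer_ms = block_bytes / (transfer_mb_per_s * 1_000_000) * 1000
    return seek_ms + rotational_delay_ms + transfer_ms


# Hypothetical drive: 9 ms average seek, 7200 rpm, 4 KB blocks, 50 MB/s.
t = block_access_time_ms(seek_ms=9, rpm=7200, block_bytes=4096, transfer_mb_per_s=50)
print(f"{t:.2f} ms")
```

With figures like these, seek time and rotational delay dominate and the transfer time is a small fraction, which is why the optimizations for block access concentrate on where blocks are placed and in what order they are requested.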


3. Performance measures of disks
- Capacity
- Access latency (seek time + rotational latency time)
- Data transfer rate

4. Optimization of disk-block access
- File organization
- Scheduling (disk-arm, i.e. I/O)
- Nonvolatile RAM for writing: battery-backed-up RAM
- Log disk: devoted to writing the log, in much the same way as nonvolatile RAM
- Log-based file system
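Disk-arm scheduling is listed among the optimizations without naming a policy. One widely used policy is the elevator (SCAN) algorithm; the sketch below assumes that policy, and the function name and the cylinder numbers are hypothetical.

```python
def elevator_order(head, requests):
    """Order pending cylinder requests for one upward sweep, then a downward sweep."""
    # Serve everything at or above the current head position while moving up.
    up = sorted(r for r in requests if r >= head)
    # Then reverse direction and serve the remaining requests while moving down.
    down = sorted((r for r in requests if r < head), reverse=True)
    return up + down


print(elevator_order(50, [10, 95, 52, 30, 70]))
```

Serving requests in sweep order rather than arrival order keeps the arm from moving back and forth across the platter, which reduces the total seek time.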


5.1.3 RAID Concept

RAID: Redundant Arrays of Independent (historically Inexpensive) Disks

A disk array is an arrangement of several disks, organized so as to
- Increase performance: data striping
- Improve reliability: redundancy

RAID Levels
- Level 0: Nonredundant striping
- Level 1: Mirrored disks
- Level 2: Error-correcting code (ECC)
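To make striping and mirroring concrete, here is a minimal sketch. It assumes round-robin block-level striping for Level 0 and models each disk as a Python dict; both function names are hypothetical.

```python
def stripe_location(block_no, n_disks):
    """Level 0: map a logical block to (disk index, block offset on that disk)."""
    return block_no % n_disks, block_no // n_disks


def mirrored_write(disks, block_no, data):
    """Level 1: write the same block to every disk in the mirror set."""
    for disk in disks:
        disk[block_no] = data


# Logical blocks 0..7 spread over 4 disks: consecutive blocks land on different disks.
print([stripe_location(b, 4) for b in range(8)])

mirror = [{}, {}]
mirrored_write(mirror, 7, b"payload")
print(mirror[0][7] == mirror[1][7])
```

Striping lets consecutive blocks be read in parallel but adds no redundancy; mirroring doubles the storage cost so that either copy can serve a read if the other disk fails.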


5.2 Stored Data Management

5.2.1 Introduction

1. Stored data kinds: user database, data dictionary/directory, log
2. Stored data structures:
   Arrangement: sequential, random
   Connection: address adjacent, chaining
3. Access modes: sequential, indexed, hashing
4. I/O buffer management
5. Interface to OS


5.2.2 Storage Management Structures

1. Logical structure: logical file, page, record, field

2. Physical structure
   Stored structure: stored file, stored record, stored item
   Device structure: device, volume, cylinder, track, physical record/sector

3. Allocation structure: extent, block


4. Mapping From Logical Structure to Physical Structure

[Figure: mapping between the structure levels.
Logical structure: Logical File, Page, Logical Record, Field
Stored structure: Stored File, Stored Record, Stored Item
Allocation structure: Extent, Block
Physical structure: Volume, Cylinder, Track, Sector]


5.2.3 Overview of File Organizations

1. Stored Data Arrangement
2. Access Modes

[Figure: tree of file organizations. Nodes: File Org.; Sequential File, Random File; Heap File, Sorted File; Indexed File, Hash File; General Index File, Tree Index File; B-Tree, B+-Tree; Static Hash, Dynamic Hash.]


3. Classification of File Organizations

[Figure: classification of file organizations by storage structure and access mode.
Storage structure: Adjacent, Chained
Access mode: Heap, Sorted, Indexed, Hashed
Other labels: Sequential, Chain, Indexed sequential, Tree-structural, Static hash, Dynamic hash, Sequential processing, Random processing]


5.3 Sequential File Structure

- How to organize the blocks/pages of a file so as to support creating and destroying a file; getting, inserting, and deleting a record; and scanning all records in the file

Adjacent (contiguous) arrangement of blocks

[Figure: a file head (Fid, P); data block 1, data block 2, ..., data block N; free blocks; frames 1, 2, ..., N, ..., N+m.]

Problems: how to insert and delete? How many free slots/pages are there?


Example

(a). Natural Sequence Structure

Student
S#         SName   SAge  Dept
200103001  李红光  23    SW
200405840  何清溪  19    MS
200203123  刘要武  20    CS
200101015  李 光   22    EE
200203101  刘 民   20    CS
200305103  张一清  21    MS
200403123  张扬名  18    SW
200201123  王克勤  21    EE

(b). Ordered Sequence Structure

Student
S#         SName   SAge  Dept
200101015  李 光   22    EE
200103001  李红光  23    SW
200201123  王克勤  21    EE
200203101  刘 民   20    CS
200203123  刘要武  20    CS
200305103  张一清  21    MS
200403123  张扬名  18    SW
200405840  何清溪  19    MS
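The practical difference between the two structures shows up when looking up a record by S#. The sketch below uses the S# values from the example; the function names are hypothetical, and records are reduced to their keys for brevity.

```python
from bisect import bisect_left

# S# values in arrival order (natural sequence) and in key order (ordered sequence).
natural = [200103001, 200405840, 200203123, 200101015,
           200203101, 200305103, 200403123, 200201123]
ordered = sorted(natural)


def scan_lookup(keys, target):
    """Natural sequence: no ordering to exploit, so scan every position."""
    for pos, key in enumerate(keys):
        if key == target:
            return pos
    return -1


def sorted_lookup(keys, target):
    """Ordered sequence: binary search on the sort key."""
    pos = bisect_left(keys, target)
    return pos if pos < len(keys) and keys[pos] == target else -1


print(scan_lookup(natural, 200203101), sorted_lookup(ordered, 200203101))
```

The ordered structure makes lookups logarithmic in the number of records, at the cost of having to keep the file sorted on every insertion.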


5.4 Chained List File Structure

[Figure (a), Hybrid Chain: a file header (Fid, P) and a chain of data pages.]

[Figure (b), Separated Chain: a file header (Fid, P1, P2) and chains of data blocks.]

- The chain pointers take up space
- With variable-length records, the full list will be virtually empty


5.5 Index Structures

5.5.1 Overview of Indexes

1. Concepts

- An index is an auxiliary data structure intended to help us find the Rids of records with a given search key value
- An index is a file/collection of records, referred to as index entries, which are usually pairs (k, Rid), where Rid is a pointer to a record with search key value k
- An index is a KTA (Key to Address) mechanism
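The (k, Rid) idea can be sketched in a few lines. The example assumes a Rid is a (page number, slot number) pair and models the data file as a dict keyed by Rid; those choices, the sample records, and the name build_index are illustrative only.

```python
from collections import defaultdict

# Data file: rid -> record, where rid = (page number, slot number) by assumption.
data_file = {
    (0, 0): (200103001, "SW"),
    (0, 1): (200405840, "MS"),
    (0, 2): (200203123, "CS"),
    (0, 3): (200101015, "EE"),
    (1, 0): (200203101, "CS"),
}


def build_index(data_file, field_no):
    """Build index entries k -> [rid, ...] on the chosen field (the search key)."""
    index = defaultdict(list)
    for rid, record in data_file.items():
        index[record[field_no]].append(rid)
    return index


by_dept = build_index(data_file, field_no=1)
# Key to address: the index returns rids, which are then used to fetch records.
print([data_file[rid] for rid in by_dept["CS"]])
```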


2. Generic index structure

[Figure: indexing on a search key SK. A value k from the domain of SK is looked up in the index, whose entries (ki, ridi), (kj, ridj), ..., (kr, ridr) point to the records in the data file with the value k of SK.]


3. Index file organizations

How to organize index entries to support rapid retrieval of entries with a given search key value? e.g.
- Sequential indexes
- Various tree-structural indexes
- Hash-based indexes (scatter table)


5.5.2 Properties of Indexes

1. Clustered vs. Unclustered

Clustered: the ordering of data records is the same as (or close to) the ordering of index entries
- The two orderings match each other

Unclustered: the two orderings do not match each other

2. Dense vs. Sparse

Dense: one index entry for each individual data record

Sparse: one index entry for a set (usually a block/page) of data records
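The dense/sparse distinction can be illustrated as follows. The sketch assumes a data file of three pages sorted on the search key (a sparse index only works when the data is ordered this way) and a sparse index that stores the first key of each page; the key values and names are hypothetical.

```python
from bisect import bisect_right

# Data file: pages of keys, sorted on the search key.
pages = [[3, 5, 11], [13, 14, 22], [23, 52, 54]]

# Dense index: one entry (key, (page, slot)) per data record.
dense = [(k, (p, s)) for p, page in enumerate(pages) for s, k in enumerate(page)]

# Sparse index: one entry (first key on page, page) per page.
sparse = [(page[0], p) for p, page in enumerate(pages)]


def sparse_lookup(key):
    """Find the page whose first key is <= key, then search inside that page."""
    first_keys = [k for k, _ in sparse]
    i = bisect_right(first_keys, key) - 1
    if i < 0:
        return None  # key is smaller than every key in the file
    p = sparse[i][1]
    return (p, pages[p].index(key)) if key in pages[p] else None


print(len(dense), len(sparse), sparse_lookup(14))
```

The sparse index is much smaller, but it can only tell which page to read; the dense index can answer whether a key exists without touching the data file.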


3. Primary vs. Secondary

Primary index: on the primary key

Secondary index: on a candidate/secondary key

4. Simple vs. Composite Key

Simple key: a single field

Composite key: more than one field


5.6 B-Tree Structured Indices

1. Nonleaf Node structure

[Figure: a nonleaf node (the root node or an inner node) has the layout
P0, K1, r1, P1, K2, r2, P2, ..., Pn-1, Kn, rn, Pn
where each ri points to the data record R(Ki), e.g. R(K1), R(K2).]


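2. Leaf Node Structure

[Figure: a leaf node has the layout
K1, r1, K2, r2, ..., Km, rm
where each ri points to the data record R(Ki), e.g. R(K1), R(K2).]

3. A B-tree structure

[Figure: an example B-tree ("-" marks an empty slot in the original drawing).
Root: 50
Inner nodes: (10, 20), (60, 70)
Leaves: (3, 5), (11, 13, 14), (22, 23), (52, 54, 55), (61, 63, 65, 69), (74, 78)
Keys in nonleaf nodes point to their records: R(10), R(20), R(50), R(60), R(70).]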
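A search over the example tree can be sketched as below. The Node class and the search function are illustrative assumptions; record pointers are omitted, and the function returns the depth at which the key is found to show that a B-tree lookup can stop at a nonleaf node.

```python
from bisect import bisect_left
from dataclasses import dataclass, field


@dataclass
class Node:
    keys: list
    children: list = field(default_factory=list)  # empty for leaf nodes


def search(node, key, depth=0):
    """Return the depth at which key is found, or None if it is absent."""
    i = bisect_left(node.keys, key)
    if i < len(node.keys) and node.keys[i] == key:
        return depth  # found in this node, leaf or not
    if not node.children:
        return None  # reached a leaf without finding the key
    return search(node.children[i], key, depth + 1)


# The example tree: root 50, inner nodes (10, 20) and (60, 70), six leaves.
left = Node([10, 20], [Node([3, 5]), Node([11, 13, 14]), Node([22, 23])])
right = Node([60, 70], [Node([52, 54, 55]), Node([61, 63, 65, 69]), Node([74, 78])])
root = Node([50], [left, right])

print(search(root, 50), search(root, 63), search(root, 12))
```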


5.7 B+-Tree Structured Indices

1. Nonleaf Node structure

P0, K1, P1, K2, P2, ..., Pn-1, Kn, Pn

2. Leaf Node Structure

K1, r1, K2, r2, ..., Km, rm


3. A B+-tree structure

[Figure: a B+-tree consists of an index set above a sequence set; the sequence set points to the record set in the data file. Labels: random access, sequential access.]
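The split between index set and sequence set can be sketched with a deliberately simplified, single-level index. All names, keys, and rid strings below are hypothetical; a real B+-tree has a multi-level index set and linked leaf pages rather than a Python list.

```python
from bisect import bisect_right

# Sequence set: leaf nodes in key order, each holding (key, rid) entries.
leaves = [
    [(3, "r3"), (5, "r5")],
    [(11, "r11"), (13, "r13"), (14, "r14")],
    [(22, "r22"), (23, "r23")],
]
# Index set (one level here): separators[i] is the smallest key in leaves[i + 1].
separators = [11, 22]


def find_leaf(key):
    """Random access: use the index set to pick the leaf that may hold key."""
    return bisect_right(separators, key)


def lookup(key):
    for k, rid in leaves[find_leaf(key)]:
        if k == key:
            return rid
    return None


def range_scan(lo, hi):
    """Sequential access: locate the first leaf, then walk the sequence set."""
    out = []
    for leaf in leaves[find_leaf(lo):]:
        for k, rid in leaf:
            if k > hi:
                return out
            if k >= lo:
                out.append(rid)
    return out


print(lookup(13), range_scan(5, 22))
```

Because every key appears in the sequence set, a range query needs only one descent through the index set followed by a scan of adjacent leaves.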


5.8 Hashing File Structures

1. General Hashing Structure

[Figure: a hash function maps a key ki to h(ki), which selects one of the blocks B0, B1, ..., Bn-1 in the major data area; each block consists of record slots, and there is a separate overflow area.]


2. Bucket Hashing

[Figure: a key is passed through the hashing function (the KTA transformation) to select one of the buckets Bu1, Bu2, ..., Bun, which are made up of primary blocks.]
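The hashing structures of Section 5.8 can be sketched as a small static hash file. The class below is an illustration under stated assumptions: the hash function is key modulo the number of blocks, each block has a fixed number of record slots, and records that do not fit go to a single shared overflow area. None of these choices is specified by the slides.

```python
class HashFile:
    """Static hashing: n primary blocks of fixed capacity plus one overflow area."""

    def __init__(self, n_blocks=4, slots_per_block=2):
        self.blocks = [[] for _ in range(n_blocks)]  # major data area
        self.overflow = []                           # overflow area
        self.slots_per_block = slots_per_block

    def _h(self, key):
        # Assumed hash function: key mod number of blocks.
        return key % len(self.blocks)

    def insert(self, key, record):
        block = self.blocks[self._h(key)]
        if len(block) < self.slots_per_block:
            block.append((key, record))
        else:
            self.overflow.append((key, record))  # primary block is full

    def get(self, key):
        # Check the primary block first, then fall back to the overflow area.
        for k, record in self.blocks[self._h(key)] + self.overflow:
            if k == key:
                return record
        return None


f = HashFile(n_blocks=4, slots_per_block=2)
for key, name in [(1, "a"), (5, "b"), (9, "c"), (2, "d")]:
    f.insert(key, name)
print(f.get(9), len(f.overflow))
```

Keys 1, 5, and 9 all hash to block 1, so the third one spills into the overflow area; lookups for it cost an extra search, which is the usual motivation for dynamic hashing schemes.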