File Structure SNU-OOPSLA Lab 1
Appendix A: Designing File Structures for CD-ROM & DVD
Seoul National University, School of Computer Science and Engineering, Object-Oriented Systems Laboratory (SNU-OOPSLA-LAB)
Professor Hyoung-Joo Kim
File Structures by Folk, Zoellick, and Riccardi
Objectives
• Show how to apply good file structure design principles to develop solutions that are appropriate to this new medium
• Describe the directory structure of the CD-ROM file system and show how it grows from the characteristics of the medium
Appendix Outline
A.1 Using This Appendix
A.2 Tree Structures on CD-ROM
A.3 Hashed Files on CD-ROM
A.4 The CD-ROM File System
A.5 Summary
A.1 Using This Appendix
Purpose of this Appendix
• The purpose is to work through the problem of designing file structures for CD-ROM
• The appendix provides a high-level look at how the performance of CD-ROM affects the design of
    • tree structures
    • hashed indexes
    • directory structures
A.2 Tree Structures on CD-ROM
Tree Structures on CD-ROM
• Tree structures are a good way to organize indexes and data on CD-ROM
• Avoiding seeks is the key strategy in CD-ROM file structure design
    • Sector size is 2 Kbytes on most CD-ROMs
    • Sequential reading is moderately fast compared with seeking; for example, reading an 8-Kbyte block after one seek is relatively more efficient than reading a 2-Kbyte block
• Block size
    • Sector size = smallest addressable unit
    • Block size = sector size * constant
    • A large tree structure should usually use at least an 8-Kbyte block
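To see why larger blocks reduce seeks, the following minimal sketch counts how many tree levels are needed to reach a given number of keys when every block is fully packed. The 32-byte index entry and the one-million-key file are assumptions chosen for illustration, not figures from the slides.

```python
SECTOR_BYTES = 2048  # CD-ROM sector size quoted on the slide


def keys_per_block(sectors_per_block, entry_bytes):
    """Number of fixed-size index entries that fit in one block."""
    return (sectors_per_block * SECTOR_BYTES) // entry_bytes


def levels_needed(num_keys, fanout):
    """Smallest number of fully packed tree levels that can address num_keys."""
    levels = 1
    capacity = fanout
    while capacity < num_keys:
        capacity *= fanout
        levels += 1
    return levels


# Assumed workload: one million keys, 32-byte entries (hypothetical values).
for sectors in (1, 4, 32):
    fanout = keys_per_block(sectors, entry_bytes=32)
    print(sectors * 2, "KB block:", fanout, "entries per block,",
          levels_needed(1_000_000, fanout), "levels")
```

Under these assumptions a 2-Kbyte block needs four levels, an 8-Kbyte block three, and a 64-Kbyte block two, so each increase in block size removes one potential seek per lookup.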
Tree Structures on CD-ROM: Special Loading Procedures
• Special loading procedures and other considerations
• The B+ tree is commonly used in CD-ROM applications because it
    • provides both indexed and sequential access to records
    • provides very shallow, broad indexes to a set of sequential records
    • makes it easy to build a two-level index above the sequence set with a separate loading procedure that builds the tree from the bottom up
Tree Structures on CD-ROM: Packing Index
• The importance of packing the index
    • CD-ROM has relatively large storage capacity, but we sometimes still run out of space because of large documents, images, etc.
    • We therefore need to pack the index with a 100%-full loading procedure
    • Bottom-up construction can pack the tree fully
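A bottom-up, 100%-full load can be sketched as follows. This is an illustrative outline rather than the textbook's procedure: it assumes a non-empty, already sorted key list, keeps blocks as Python lists instead of disc sectors, and assumes the root entries fit in one block. The function names are hypothetical.

```python
def chunk(items, size):
    """Split a list into consecutive pieces of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def build_bottom_up(sorted_keys, records_per_block, keys_per_index_block):
    """Build a two-level index above a fully packed sequence set.

    Each index entry is (first key in the child, position of the child).
    """
    sequence_set = chunk(sorted_keys, records_per_block)
    leaf_entries = [(block[0], pos) for pos, block in enumerate(sequence_set)]
    index_blocks = chunk(leaf_entries, keys_per_index_block)
    root = [(block[0][0], pos) for pos, block in enumerate(index_blocks)]
    return root, index_blocks, sequence_set


def lookup(key, root, index_blocks, sequence_set):
    """Follow root -> index block -> sequence-set block; True if key is stored."""
    def child_for(entries):
        # Pick the last entry whose first key is <= key (entries are sorted).
        chosen = entries[0][1]
        for first_key, pos in entries:
            if first_key <= key:
                chosen = pos
        return chosen

    index_block = index_blocks[child_for(root)]
    data_block = sequence_set[child_for(index_block)]
    return key in data_block
```

Because the keys are known in advance and never change, every block except possibly the last one at each level is filled completely, which is exactly what an insertion-driven B+ tree cannot guarantee.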
Tree Structures on CD-ROM: Virtual Tree and Secondary Index
• Virtual trees and buffering blocks
    • The root of the tree should always be buffered in RAM in order to reduce seek time
    • Buffering blocks below the root node should reduce seek time further
    • Buffering is most useful when successive accesses to the tree tend to be clustered in one area
• Trees as secondary indexes on CD-ROM
    • CD-ROM applications usually provide more than one access route to the data on the disc
    • A secondary index should be bound to its target records as tightly as possible
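The buffering idea can be illustrated with a small block buffer that pins the root and keeps the most recently used lower blocks. This is a sketch under assumptions: `read_block` is a hypothetical function standing in for a disc read, every miss is counted as one seek, and LRU replacement is one possible policy, not one prescribed by the slides.

```python
from collections import OrderedDict


class BlockBuffer:
    """LRU buffer of tree blocks with the root pinned; counts simulated seeks."""

    def __init__(self, read_block, root_id, capacity):
        self.read_block = read_block      # hypothetical: block id -> block contents
        self.root_id = root_id
        self.root = read_block(root_id)   # root is read once and kept in RAM
        self.capacity = capacity          # number of non-root blocks to keep
        self.cache = OrderedDict()
        self.seeks = 1                    # the initial root read

    def get(self, block_id):
        if block_id == self.root_id:
            return self.root
        if block_id in self.cache:
            self.cache.move_to_end(block_id)   # mark as most recently used
            return self.cache[block_id]
        self.seeks += 1                        # miss: one seek to the disc
        block = self.read_block(block_id)
        self.cache[block_id] = block
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)     # evict the least recently used block
        return block
```

When successive lookups fall in the same region of the tree, most `get` calls hit the buffer and the seek counter barely moves, which is the clustering effect the slide describes.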
Update of Tree Structures on CD-ROM
• Are tree structures on CD-ROM never updated?
    • One objection to tight binding is that a CD-ROM is quite frequently reorganized between successive editions of the disc
• Several approaches address this objection
    1. Maintain loosely bound records in the source database and transform them into tightly bound records for publication on CD-ROM
    2. Trade off performance on the published disc for decreased costs in producing it
A.3 Hashed Files on CD-ROM
Hashed Files on CD-ROM
• Hashing is an excellent way to organize indexes on CD-ROM because it can retrieve a record with a single access
• We should avoid overflow
• Bucket size
    • Should be a multiple of 2 Kbytes
    • A bucket smaller than a sector would be counterproductive
    • How many sectors should go into a bucket?
        • It is a trade-off between seeking and sequential reading
        • Larger buckets require more sequential reading and more searching to find the record in the buffer
Packing of Hashed Files on CD-ROM
• Packing of a hashed file should avoid overflow, because overflow causes additional seeking
• Packing loosely, with a moderate bucket size, reduces overflow
    • Keeping the packing density below 60% avoids almost all overflow
    • A packing density of 60% with a bucket size of 10
        • reduces overflow to 1.3%
        • reduces the average number of seeks to 1.01
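The 1.3% figure can be reproduced with a short calculation. The sketch below assumes that the number of records hashing to a bucket follows a Poisson distribution with mean equal to packing density times bucket size; that model is an assumption of this example, not something stated on the slide.

```python
import math


def overflow_fraction(packing_density, bucket_size, terms=200):
    """Expected fraction of records that overflow, under a Poisson model."""
    mean = packing_density * bucket_size      # expected records per bucket
    p = math.exp(-mean)                       # Poisson P(0)
    expected_overflow = 0.0
    for k in range(1, bucket_size + terms):
        p *= mean / k                         # Poisson P(k) computed from P(k-1)
        if k > bucket_size:
            expected_overflow += (k - bucket_size) * p
    return expected_overflow / mean


print(round(100 * overflow_fraction(0.6, 10), 1), "% of records overflow")
```

For a packing density of 60% and a bucket size of 10 the model gives about 1.3%. If each overflowing record costs one extra seek, the average is roughly 1 + 0.013 seeks, consistent with the 1.01 quoted above.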
Advantages of Hashed Files on CD-ROM
• Advantages of CD-ROM's read-only status
    • The packing density of the index can be as high as 100%
    • We have all the keys that are to be hashed at hand
    • We can choose a hash function that provides the performance we need
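Because all keys are known before the disc is pressed, a hash function can be fitted to them offline. One simple way to do this, shown here only as an illustration, is to try a family of seeded hash functions and keep the seed that produces the fewest overflow records. The seeded SHA-256 family and the function names are assumptions of this sketch.

```python
import hashlib


def bucket_of(key, seed, num_buckets):
    """Seeded hash: one member of a family of candidate hash functions."""
    digest = hashlib.sha256(f"{seed}:{key}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


def overflow_count(keys, seed, num_buckets, bucket_size):
    """Number of keys that do not fit in their home bucket."""
    counts = [0] * num_buckets
    for key in keys:
        counts[bucket_of(key, seed, num_buckets)] += 1
    return sum(max(0, c - bucket_size) for c in counts)


def choose_seed(keys, num_buckets, bucket_size, candidates=range(100)):
    """Try each candidate seed and keep the one with the fewest overflows."""
    return min(candidates,
               key=lambda s: overflow_count(keys, s, num_buckets, bucket_size))
```

The search costs time only when the disc is produced; readers of the disc simply use the chosen seed, which reflects the asymmetry between writing once and reading many times.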
A.4 The CD-ROM File System
CD-ROM File System
• The design goals
    • Support hierarchical directory structures
    • Find and open any one of thousands of files with only one or two seeks
    • Support the use of generic file names, as in “file*.c”, and subdirectory accesses
Two Approaches to the CD-ROM File System
• Two approaches that were commercially available and tailored to CD-ROM
    1. Left-child, right-sibling tree (Figure A.2)
        • Places the entire directory structure in a single file
        • Works well if the directory structure is small
        • Performs poorly if the directory structure is large
    2. Hashed index structure (Figure A.3)
        • Creates an index to the file locations by hashing the full path name of each file
        • Works well if a single file is accessed
        • Does a very poor job of supporting generic file names such as “file*.c” or directory listing commands such as “ls” or “dir”
Hybrid Design of CD-ROM File System
• Hybrid design = conventional directory structure + hashed index
    • Provides single-seek access to any file
    • Provides the ability to work with generic file names and commands such as “ls” or “dir”
    • Without the directory files, retrieving a subdirectory would mean processing the hash entries of all files
• Approach
    • Build a conventional directory structure (use a file for each directory)
    • Build an index for the subdirectories to solve the access problem
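The hybrid idea can be sketched with two small tables: one directory file per directory, and an index from directory path to the location of that directory file. Everything concrete here is hypothetical (the paths, the disc locations, and the use of a Python dict in place of an on-disc hashed index); the point is only to show how single-file access and generic-name listing are both served.

```python
import fnmatch

# Hypothetical disc image: one "directory file" per directory, keyed by its
# location on disc; each maps file names to file locations.
DIRECTORY_FILES = {
    100: {"file1.c": 5000, "file2.c": 5200, "notes.txt": 5400},
    200: {"main.c": 6000},
}

# Index from full directory path to the location of its directory file.
# A Python dict stands in for the hashed index stored on the disc.
DIRECTORY_INDEX = {"/src": 100, "/src/tools": 200}


def open_file(path):
    """Locate one file: hash the directory path, then search its directory file."""
    directory, _, name = path.rpartition("/")
    return DIRECTORY_FILES[DIRECTORY_INDEX[directory]][name]


def list_matching(directory, pattern="*"):
    """Generic-name lookup such as file*.c, served from one directory file."""
    names = DIRECTORY_FILES[DIRECTORY_INDEX[directory]]
    return sorted(n for n in names if fnmatch.fnmatch(n, pattern))
```

For example, `open_file("/src/file2.c")` touches only the index and one directory file, while `list_matching("/src", "file*.c")` reads that same directory file sequentially instead of probing a hash entry for every file on the disc.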
Extensions of CD-ROM File System
• The CD-ROM committee settled on an approach that went one step further
    • They decided to use a special index that takes advantage of the hierarchy of the subdirectories
    • The directories are ordered in the index so that parents appear before their children
    • Each child is associated with an integer that is a backward reference to the relative record number (RRN) of its parent
• It is a good example of a specialized index structure that makes use of the hierarchical structure of the subdirectories
RRN  Name      Parent
0    Root      -1
1    Reports    0
2    Letters    0
3    School     1
4    Work       1
5    Personal   2
6    Work       2
Figure A.4 Path index table of directories
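The table in Figure A.4 is small enough to walk through in code. The sketch below copies its rows and shows two operations that the parent references make possible: resolving a path to an RRN and rebuilding a path from an RRN. The function names are hypothetical, and the table is held in memory rather than read from a disc.

```python
# (name, parent RRN) rows copied from Figure A.4; the list position is the RRN.
PATH_INDEX = [
    ("Root", -1),
    ("Reports", 0),
    ("Letters", 0),
    ("School", 1),
    ("Work", 1),
    ("Personal", 2),
    ("Work", 2),
]


def find_directory(components):
    """Return the RRN reached by following `components` down from the root."""
    current = 0
    for name in components:
        # Children always appear after their parent, so search forward only.
        for rrn in range(current + 1, len(PATH_INDEX)):
            if PATH_INDEX[rrn] == (name, current):
                current = rrn
                break
        else:
            raise FileNotFoundError("/".join(components))
    return current


def full_path(rrn):
    """Rebuild a directory's path by following parent references to the root."""
    parts = []
    while rrn > 0:
        name, parent = PATH_INDEX[rrn]
        parts.append(name)
        rrn = parent
    return "/" + "/".join(reversed(parts))
```

The two directories named Work are told apart by their parent field: `find_directory(["Reports", "Work"])` yields RRN 4, while `find_directory(["Letters", "Work"])` yields RRN 6.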
A.5 Summary
Summary
• CD-ROM is an electronic publishing medium
• CD-ROM is built on top of the CD audio standard
• The primary disadvantage of CD-ROM is poor seek performance
• B-tree and B+ tree structures work well on CD-ROM because of their ability to provide access to many keys with just a few seeks
• The sector size of CD-ROM is 2 Kbytes, so the block size should be a multiple of 2 Kbytes
• A larger block is usually advantageous in a tree
• It is better to build the index in bottom-up fashion, because there are no updates and the resulting index density is high
• A hashed index is a good choice for CD-ROM because of its single-seek access
• CD-ROM file systems use either a directory structure or a hashed index
• A hybrid file system has the advantages of both approaches
Let’s Review !!!
A.1 Using This Appendix
A.2 Tree Structures on CD-ROM
A.3 Hashed Files on CD-ROM
A.4 The CD-ROM File System
A.5 Summary
Introducing “DVD Technology”
Seoul National University, School of Computer Science and Engineering
SNU OOPSLA Lab.
Professor Hyoung-Joo Kim
Introducing "DVD Tech."File Structure
SNU-OOPSLA-Lab. 21
Contents
• What is DVD?
• History from CD to DVD
• DVD Capacity
• CD vs. DVD
• File Structure
    • Track Structure
    • Sector Structure
    • Block Structure
    • Track Buffer Structure
    • Tree Structure
    • Hashed Files
• DVD Video
What is DVD?
• DVD
    • Digital Video Disc (DVD-Video)
    • Digital Versatile Disc (DVD-ROM)
• Announced in September 1995
    • As a movie-playback format
    • As a computer-ROM format
• Next-generation optical disc storage technology
    • Expected to replace audio CD, videotape, laserdisc, CD-ROM, etc.
The history from CD to DVD
• 1980, Sony & Philips --> CD-Audio
• 1985, Sony & Philips --> CD-ROM
• 1989, Sony & Philips --> CD-I
• 1990, Sony & Philips --> CD-R
• 1995 --> CD-E
• September 1995 --> DVD
DVD Capacity
• Single-sided
    • DVD5 (4.7 GB, single-layer)
    • DVD9 (8.5 GB, dual-layer)
• Double-sided
    • DVD10 (9.4 GB = 4.7 GB x 2, single-layer)
    • DVD18 (17 GB = 8.5 GB x 2, dual-layer)
• Write-once
    • DVD-R (3.8 GB/side)
• Overwrite
    • DVD-RAM (more than 2.6 GB/side)
[Figure: disc cross-sections, each made of two 0.6 mm substrates. (a) Single-sided, single-layer: one reflective layer on the substrate. (b) Single-sided, dual-layer: a semi-transmissive layer (gold) and a reflective layer (silver).]
CD vs. DVD
• Laser beam
    • CD --> infrared light (780 nm)
    • DVD --> red light (635-650 nm)
• Capacity
    • CD --> maximum 680 MB
    • DVD --> maximum 17 GB (25 times that of CD)
• Reference speed
    • CD --> 1.2 m/sec CLV
    • DVD --> 4.0 m/sec CLV
Track Structure
Legend
    I   Lead-in area (leader space near edge of disc)
    D   Data area (contains actual data)
    O   Lead-out area (leader space near edge of disc)
    X   Unusable area (edge or donut hole)
    M   Middle area (interlayer lead-in/out)
    B   Dummy bonded layer (to make the disc 1.2 mm thick instead of 0.6 mm)
Single-layer disc:
    BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
    XX I I I DDDDDDDDDDDDDDDDDDDDDOOOXX
    (reference axis at the left, outer edge of the disc at the right)
Direction: continuous spiral from the inside to the outside of the disc.
Dual-layer disc:
• (A) Parallel track path (for computer CD-ROM use)
    • Direction: the same for both layers
• (B) Opposite track path (for movies)
    • Direction: opposite on the two layers
    • Since the reference beam and angular velocity are the same at the layer transition point, the only delay comes from refocusing. This permits seamless transition for movie playback.
(A) Parallel track path
    XX I I I DDDDDDDDDDDDDDDDDDDDDOOOXX   layer 1
    XX I I I DDDDDDDDDDDDDDDDDDDDDOOOXX   layer 0
(B) Opposite track path
    XX I I I DDDDDDDDDDDDDDDDDDDDDOOOXX   layer 1
    XX I I I DDDDDDDDDDDDDDDDDDDDDOOOXX   layer 0
    (reference axis at the left, outer edge of the disc at the right)
Sector Structure
• 2064 bytes/sector, organized into 12 rows of 172 bytes each (172 x 12 = 2064)
• The first row starts with a 12-byte sector header (ID, IEC, reserved bytes)
• The final row ends with 4 bytes of EDC
Row   Fields within row
0     ID (4B)  IEC (2B)  RESERVED (6B)  Main data (160B: D[0]-D[159])
1     Main data (172B: D[160]-D[331])
2     Main data (172B: D[332]-D[503])
3     Main data (172B: D[504]-D[675])
4     Main data (172B: D[676]-D[847])
5     Main data (172B: D[848]-D[1019])
6     Main data (172B: D[1020]-D[1191])
7     Main data (172B: D[1192]-D[1363])
8     Main data (172B: D[1364]-D[1535])
9     Main data (172B: D[1536]-D[1707])
10    Main data (172B: D[1708]-D[1879])
11    Main data (168B: D[1880]-D[2047])  EDC (4B)
ID: Identification Data (32-bit sector number)
IEC: ID Error Correction
EDC: Error Detection Code
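The layout in the table can be checked by slicing a 2064-byte buffer at the stated offsets. This is only a sketch of the byte layout: it does not verify the IEC or EDC values, and the function names are hypothetical.

```python
SECTOR_BYTES = 2064
ROW_BYTES = 172


def split_sector(sector):
    """Split one 2064-byte sector into its fields (no error checking is done)."""
    if len(sector) != SECTOR_BYTES:
        raise ValueError("expected 2064 bytes")
    return {
        "id": sector[0:4],             # 32-bit sector number
        "iec": sector[4:6],            # ID error correction
        "reserved": sector[6:12],
        "main_data": sector[12:2060],  # D[0]..D[2047]
        "edc": sector[2060:2064],      # error detection code
    }


def rows(sector):
    """Return the 12 rows of 172 bytes that make up the sector."""
    return [sector[i:i + ROW_BYTES] for i in range(0, SECTOR_BYTES, ROW_BYTES)]
```

Row 0 then carries the 12 header bytes plus D[0]-D[159], and row 11 carries D[1880]-D[2047] followed by the 4 EDC bytes, matching the table.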
Block Structure
• To combat burst errors, 16 sectors are interleaved together (16 sectors * 12 rows/sector = 192 rows)
• Error correction bytes are appended
    • 10 bytes at the end of each row
    • 16 rows at the end of the block
[Figure: a data block of 192 rows x 172 bytes, with 10 error correction bytes appended to each row and 16 rows of error correction bytes appended to the block.]
Payload per block = (172 x 192) / (182 x 208) x 100 = 87 %
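The 87% figure follows directly from the block dimensions. The short sketch below recomputes it and, as an additional assumption-based figure, the share of the block occupied by the 2048 user bytes of each of the 16 sectors.

```python
def block_efficiency(row_bytes=172, data_rows=192, row_ecc_bytes=10, ecc_rows=16):
    """Fraction of an error-corrected block occupied by sector rows."""
    payload = row_bytes * data_rows
    total = (row_bytes + row_ecc_bytes) * (data_rows + ecc_rows)
    return payload / total


def user_data_efficiency(sectors=16, user_bytes=2048, total_bytes=182 * 208):
    """Fraction of the block occupied by main data only (headers and EDC excluded)."""
    return sectors * user_bytes / total_bytes


print(round(100 * block_efficiency()), "% payload")
```

With the default values the payload is 33,024 of 37,856 bytes, which rounds to 87%.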
Track Buffer
• The size of the track buffer is left to the implementation, although the minimum recommended size is 2 Mbit
• Track buffer > Tmax * VBRmax = 0.104 sec * 10.08 Mbit/sec = 1.04832 Mbit
    • Tmax: maximum latency of one disc revolution
    • VBRmax: maximum mux rate for any program
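The buffer bound is a single multiplication, shown here together with a check of whether a given buffer can keep feeding the decoder across a track jump. The sketch assumes that the 10.08 figure is the DVD maximum multiplex rate in Mbit/sec and that no data arrives during the jump; the function names are hypothetical.

```python
def min_track_buffer_mbit(t_max_sec=0.104, vbr_max_mbit_per_sec=10.08):
    """Data consumed while no new data arrives: Tmax * VBRmax."""
    return t_max_sec * vbr_max_mbit_per_sec


def survives_jump(buffer_mbit, jump_sec, rate_mbit_per_sec=10.08):
    """True if a full buffer can supply the decoder for the whole jump."""
    return buffer_mbit > jump_sec * rate_mbit_per_sec


print(min_track_buffer_mbit())
```

A buffer of twice the computed minimum leaves headroom for a jump somewhat longer than one revolution, which is one plausible reading of why the recommended size is larger than the bound.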
(Input stream to track buffer)
    [n-2][n-1][n] .......... track jump ....... [m][m+1][m+2]
    T = duration of the track jump (no data transfer during the discontinuity)
(Output stream from track buffer)
    [n-2][n-1][n][m][m+1][m+2]
    (no apparent discontinuity)
(Initial buffer delay introduced by track buffer)
Tree Structure
• Tree structures are a good way to organize indexes and data on DVD
• Design issues
    • Block size of the B-tree and B+ tree
    • Memory size for buffering blocks
    • Loading procedure of the B+ tree implementation
    • Access mechanism of the primary index and secondary indexes
Tree Structure (2)
• Block size
    • If the block size is large, the tree can provide access to a large number of records in only a few seeks
    • Block size >= sector size
    • DVD-ROM’s sequential reading performance is moderately fast compared with its seeking performance, so it is beneficial to use a block composed of several sectors
Tree Structure (3)
• B+ tree
    • Provides both indexed and sequential access to records
    • Provides very shallow, broad indexes to a set of sequenced records
    • Provides access to millions of records with an index that is only two levels deep
    • If the root of the index is kept in RAM, the cost of searching is reduced to a single seek
    • A 100%-full loading procedure can be designed using bottom-up construction
Hashed Files
• Hashing is an excellent way to organize indexes on DVD-ROM because it retrieves a record with a single access
• Design issues
    • Bucket size
    • Packing density for the hashed index
    • Hash function
Hashed Files (2)
• Bucket size
    • Bucket size >= sector size
    • A multiple of the sector size
    • Trade-off between seeking and sequential reading
    • Keeping the packing density below 60% will tend to avoid overflow almost all the time
• Hash function
    • The effort of fitting a hash function is worthwhile because of the asymmetric nature of writing and reading DVD-ROM
DVD Video Features
• Over 2 hours of high-quality digital video (over 8 hours on a double-sided, dual-layer disc)
• Support for widescreen movies on standard or widescreen TVs (4:3 and 16:9 aspect ratios)
• Up to 8 tracks of digital audio
• Up to 32 subtitle/karaoke tracks
• Up to 9 camera angles
• Multilingual identifying text for title name, album name, song name, actors, etc.
DVD Video Encoding Data
• Encoding image
    • MPEG-2 compression (developed by the Moving Picture Experts Group)
    • High resolution (better than CD and LD; 3 times better than videotape)
• Encoding sound
    • Dolby Digital surround AC-3 sound compression (supports five sound channels plus a subwoofer channel: left, center, right, rear-left, rear-right)
DVD Video Card
[Figure: encoded data (MPEG-2 image, AC-3 sound) --> decoder --> decoded data]
Let’s Review !!!
• What is DVD?
• History from CD to DVD
• DVD Capacity
• CD vs. DVD
• File Structure
    • Track Structure
    • Sector Structure
    • Block Structure
    • Track Buffer Structure
    • Tree Structure
    • Hashed Files
• DVD Video