Loading File into Memory


1

Lecture 07: Loading File into Memory

DMA, buffer replacement, LRU

2

• Buffer
  – each buffer holds one disk block (sector)
  – the kernel has N buffers, shared by all
• The OS needs information about each buffer
  – user: Clinton, Bob, ... (who is using this buffer now)
  – hw: device, sector number
  – state: free/used (empty/waiting/reading/full/locked/writing/dirty)
• "Buffer header" (struct)
  – stores all information about each buffer
  – points to the actual buffer
  – has link fields (doubly linked): device_list, free_buffer_list, I/O_wait_list

[Figure: disk sector, DMA transfer, buffer in memory, CPU]

3

"Buffer Cache"
• Managed like a CPU cache
  – read ahead (reada)
  – delayed write (dwrite)
• dwrite
  – just set the "dirty" bit in the buffer cache (on update)
  – write to disk later (when the buffer is being replaced)
• reada
  – prefetch if the offset moves sequentially
• dirty: the data came from disk and the memory copy was modified later, so the disk copy and the memory copy now differ
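The dwrite and reada rules can be sketched in a few lines of C. This is only an illustration: `struct cached_block`, `struct fake_disk`, `B_DIRTY`, `BLOCK_SIZE`, and the three functions are hypothetical names that do not come from the slides, and the "disk" merely counts writes.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define BLOCK_SIZE 512      /* assumed block size */
#define B_DIRTY    0x1      /* hypothetical flag bit */

struct cached_block {
    long blkno;             /* which disk block this buffer holds */
    int  flags;             /* B_DIRTY set => memory copy differs from disk */
    char data[BLOCK_SIZE];
};

/* Stand-in for a real disk: it only counts writes. */
struct fake_disk {
    int writes;
};

/* dwrite: update the memory copy and set the dirty bit; no disk I/O here. */
bool cache_update(struct cached_block *cb, size_t off,
                  const char *src, size_t len)
{
    if (off > BLOCK_SIZE || len > BLOCK_SIZE - off)
        return false;       /* would run past the end of the buffer */
    memcpy(cb->data + off, src, len);
    cb->flags |= B_DIRTY;
    return true;
}

/* Called when the buffer is about to be reused for another block.
   Returns true if the old content had to be written back first. */
bool cache_replace(struct cached_block *cb, struct fake_disk *disk,
                   long new_blkno)
{
    bool wrote = false;
    if (cb->flags & B_DIRTY) {
        disk->writes++;     /* the delayed write happens now */
        cb->flags &= ~B_DIRTY;
        wrote = true;
    }
    cb->blkno = new_blkno;
    return wrote;
}

/* reada: prefetch only when the access pattern looks sequential. */
bool should_read_ahead(long prev_blkno, long cur_blkno)
{
    return cur_blkno == prev_blkno + 1;
}
```

Two updates to the same buffer cost one disk write at replacement time instead of two, which is where the saving comes from.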


4

Delayed Write: Pros & Cons
• Good performance
  – much disk traffic can be saved
• Complex reliability
  – logically a single piece of information, but physically several copies (disk, buffer), hence possible inconsistency
  – what if the system crashes ...


5

[Figure: power vs. time (t) during a power failure: (1) problem detected, (2) computer full stop]

6

[Figure: power vs. time (t): problem detected & interrupt, then computer full stop]

How many disk blocks can you save during this interval?

Emergency action during this period

7

Crash ...
• Only a few blocks can be saved
• What happens if they cannot be saved?

If lost, the following goes wrong:
  – superblock: which block is free/occupied?
  – inode: pointer to file data block
  – data block: if a directory, the subtree structure; if a regular file, just the file content

• Metadata are more important: superblock, directory, inode

8

Damage: what if this block becomes a bad block?

[Figure: disk layout with superblock, root directory, inode, and data blocks; legend: holes, occupied]

9

Crash ...
• In a program: the sync(2) system call
  – sync(2) flushes (writes to disk) the dirty buffers
  – it does not finish the disk I/O on return (it just queues the writes)
  – so call sync(2) twice: the return of the second call guarantees the flush
• At the keyboard
  – updated calls sync(2) every 30 seconds (periodic)
  – halt(8) and shutdown(8) call sync(2) (run by the superuser before logoff; try "man 8 intro")
• Caution
  – Do not power down without sync(2) or halt(8)
  – Otherwise the system crashes. What if it crashes?
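From a program, flushing might look like the sketch below. The helper name `save_record` is made up for illustration. It also uses fsync(2), which the slides do not mention: fsync(2) waits for one file's dirty buffers, whereas sync(2) covers all dirty buffers in the system.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: append text to a file and push it toward the disk.
   Returns 0 on success, -1 on failure. */
int save_record(const char *path, const char *text)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd < 0)
        return -1;

    size_t len = strlen(text);
    /* write() copies the data into the kernel's buffer cache and marks
       the buffers dirty; the data may not have reached the disk yet. */
    if (write(fd, text, len) != (ssize_t)len) {
        close(fd);
        return -1;
    }

    /* fsync() blocks until this file's dirty buffers have been written. */
    if (fsync(fd) != 0) {
        close(fd);
        return -1;
    }
    if (close(fd) != 0)
        return -1;

    /* sync() asks the kernel to flush every dirty buffer in the system. */
    sync();
    return 0;
}
```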

10

fsck(8)
• File system check: check & repair the file system
  – performed at system bootup time
  – start from the root inode: mark all occupied blocks
  – start from the superblock: mark all free blocks
  – something is wrong if:
    • some block has no incoming arc (unreachable)
    • some block has many incoming arcs (reached many times)
    • lost+found
  – Very time-consuming: 10 ms × (1 GB / 1 KB) = 10 mega ms = 10,000 sec !!!
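The consistency rule and the time estimate can be written down as a small C sketch. It assumes the two marking passes have already produced per-block counters, and the names `classify_block`, `count_inconsistent`, and `scan_seconds` are illustrative. It is not how any real fsck is organized.

```c
#include <stddef.h>

enum block_status {
    BLOCK_OK,               /* exactly one incoming arc */
    BLOCK_UNREACHABLE,      /* no incoming arc */
    BLOCK_MULTIPLY_REACHED  /* more than one incoming arc */
};

/* from_inodes:    times the block was reached walking down from the root inode
   from_free_list: times the block was reached from the superblock free list */
enum block_status classify_block(unsigned from_inodes, unsigned from_free_list)
{
    unsigned arcs = from_inodes + from_free_list;
    if (arcs == 0)
        return BLOCK_UNREACHABLE;
    if (arcs > 1)
        return BLOCK_MULTIPLY_REACHED;
    return BLOCK_OK;
}

/* Count the blocks that violate the "exactly one arc" rule. */
size_t count_inconsistent(const unsigned *from_inodes,
                          const unsigned *from_free_list, size_t nblocks)
{
    size_t bad = 0;
    for (size_t b = 0; b < nblocks; b++)
        if (classify_block(from_inodes[b], from_free_list[b]) != BLOCK_OK)
            bad++;
    return bad;
}

/* Rough scan time, assuming one disk access per block. */
double scan_seconds(double disk_bytes, double block_bytes, double ms_per_block)
{
    return disk_bytes / block_bytes * ms_per_block / 1000.0;
}
```

With the slide's round numbers, `scan_seconds(1e9, 1e3, 10.0)` gives 10,000 seconds, the figure quoted above.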

11

Design Goal
• The original UNIX file system design was
  – cheap, with good performance
  – adequately reliable for a school or software house (the timesharing market)
• On a power fault
  – at most 30 seconds' worth of work is gone
  – the most important metadata are saved
• UNIX for a bank?
  – Need to solve these problems

[Figure: timeline with a flush every 30 sec; on a power down between flushes, some contents are lost]

12

Modern systems
• System V
  – To reduce boot time (minimize downtime)
    • On successful return from sync(2), create the /fastboot file
    • If /fastboot exists, the system was shut down cleanly (don't fsck)
    • After a successful boot, remove the /fastboot file
    • If /fastboot doesn't exist, run fsck (only for /etc/fstab)
• Log-Structured File System
  – collect dirty nodes in one big segment (~track size)
  – periodically write this log to disk
    • fast: no seek/rotational delay
  – recovery is fast & complete
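The /fastboot protocol is simple enough to sketch in C. The function names are hypothetical and the marker path is passed in as a parameter so the logic can be tried on an ordinary file. A real system does this from its shutdown and boot scripts, not necessarily in this form.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Shutdown side: flush, then leave the "clean shutdown" marker behind. */
int mark_clean_shutdown(const char *marker_path)
{
    sync();
    int fd = open(marker_path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    return close(fd);
}

/* Boot side: returns 1 if fsck should run, 0 if the marker is present. */
int needs_fsck(const char *marker_path)
{
    return access(marker_path, F_OK) == 0 ? 0 : 1;
}

/* After a successful boot: remove the marker so that a later crash
   is detected. A marker that is already gone is not an error. */
int boot_finished(const char *marker_path)
{
    if (unlink(marker_path) != 0 && errno != ENOENT)
        return -1;
    return 0;
}
```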


13

Issues
• Transactional guarantee
  – Write all, or write nothing at all
  – "Account A → Account B (transfer $100)"
  – Atomic transaction
  – Write both or cancel both
• Ordering guarantee
  – "Delete file A"
    1. Modify the parent directory's data block (file name A)
    2. Release file A's inode (address of data block sectors, …)
    3. Release file A's data block
  – Suggested order: (3 2 1)
  – Otherwise, A's inode exists, pointer exists, wrong data …
  – Write the next block to disk only if the previous write is complete: synchronous write

** Reference: Vahalia, 11.7.2

[Figure: "remove b": a directory with entries a, b, dev, bin (i-numbers 7, 9, 11, 45); the inode of b, whose pointers[ ] lead to its data blocks]

14

Back to buffer cache

15

[Figure: buffers 22, 23, 88, 83, 14, 25, 45, 32, 74, 37, 11, 19, with the free buffers linked together]

Some buffers are linked to the free buffer pool.

17

Allocate buffers to whom?

[Figure: Process 1 and its kernel tables (CPU, user, inode, offset, dev) next to the buffer cache; labels: UNIX, Linux]

18

[Figure: buffers 11, 43, 23, 33, 15, 44, 54, 64, 97, 10, 99, 18 linked to Disk 3]

Among the buffers allocated to a device, some are waiting to do DMA, some are currently doing DMA, and others have finished DMA.

The buffer header has a flag.

(I/O wait queue) within (dev)

19

[Figure: buffers 11, 43, 23, 33, 15, 44, 54 on Disk 3, with buffer 18 on the I/O wait queue; some buffers are waiting to do DMA, others have done DMA]

Some buffers are waiting for disk I/O.

20

struct buf {
    int b_flags;            /* see defines below */

    struct buf *b_forw;     /* headed by devtab of b_dev */
    struct buf *b_back;     /* " */
    struct buf *av_forw;    /* position on free list, */
    struct buf *av_back;    /* if not BUSY */

    int b_dev;              /* major+minor device name */
    char *b_blkno;          /* block # on device */
    int b_wcount;           /* transfer count (usu. words) */
    char b_error;           /* returned after I/O */

    char *b_addr;           /* low order core address */
    char *b_xmem;           /* high order core address */
} buf[NBUF];

struct buf bfreelist;
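The two pairs of link fields let one header sit on a device list and on the free list at the same time. The sketch below shows the pointer manipulation with a cut-down header. `struct bufhdr` and the four helper functions are illustrative names, and the circular lists with a dummy head are an assumption about the layout, not a copy of the kernel code.

```c
#include <stddef.h>

/* Cut-down header: only the block number and the four link fields. */
struct bufhdr {
    int b_blkno;
    struct bufhdr *b_forw, *b_back;     /* device list links */
    struct bufhdr *av_forw, *av_back;   /* free list links */
};

/* Make a header an empty circular list head for both lists. */
void list_head_init(struct bufhdr *head)
{
    head->b_forw = head->b_back = head;
    head->av_forw = head->av_back = head;
}

/* Link bp at the front of a device list. */
void dev_insert(struct bufhdr *devhead, struct bufhdr *bp)
{
    bp->b_forw = devhead->b_forw;
    bp->b_back = devhead;
    devhead->b_forw->b_back = bp;
    devhead->b_forw = bp;
}

/* Put bp at the tail of the free list (most recently released). */
void free_append(struct bufhdr *freehead, struct bufhdr *bp)
{
    bp->av_back = freehead->av_back;
    bp->av_forw = freehead;
    freehead->av_back->av_forw = bp;
    freehead->av_back = bp;
}

/* Take bp off the free list; its device list links stay untouched. */
void free_remove(struct bufhdr *bp)
{
    bp->av_back->av_forw = bp->av_forw;
    bp->av_forw->av_back = bp->av_back;
    bp->av_forw = bp->av_back = NULL;
}
```

Because released buffers go to the tail, the head of the free list is always the least recently released buffer, which is what an LRU replacement policy needs.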

21

struct devtab {
    char d_active;          /* busy flag */
    char d_errcnt;          /* error count (for recovery) */
    struct buf *b_forw;     /* first buffer for this dev */
    struct buf *b_back;     /* last buffer for this dev */
    struct buf *d_actf;     /* head of I/O queue */
    struct buf *d_actl;     /* tail of I/O queue */
};

[Figure: struct devtab (d_active, b_forw, b_back, d_actf, d_actl) pointing to the device's buffers (11, 43, 23, 33, 15, 44, 54, 64, 97, 10, 99, 18) and to its I/O waiting buffers]

22

Remember: the OS kernel is a plain C program with variables and functions.

[Figure: the OS kernel holds a table (data structure) for each object (hardware or software): a PCB for each of Process 1, Process 2, Process 3, and tables for CPU, mem, disk, tty]

23

Kernel Data Structure

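[Figure: kernel data structures for Process 1 (CPU, user, inode, offset); disk_read( ) goes through devswtab and devtab to the buffer cache; on disk: superblock, inode, data; directory tree: / with bin and etc, holding cc, date, sh, getty, passwd]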

24

• Each buffer header has 4 link fields
  – a buf can belong to two doubly linked lists at a time
• read(fd) system call
  – get the offset
  – get the inode
    • check access permission (rwx rwx rwx)
    • mapping: offset → sector address
    • get the major/minor device number
  – search the buffer cache (the buffer header has the disk & sector #)
    • start from the device table, traverse the links
    • compare each buffer with the sector address
  – if already in the buffer cache, done
  – if miss, arrange to read from disk
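The search step can be sketched as a walk along the device's buffer list. For brevity the list here is singly linked and NULL-terminated, and `struct blkbuf`, `struct devlist`, `offset_to_block`, and `cache_lookup` are hypothetical names. The offset mapping ignores the inode's block pointers and simply divides by the block size.

```c
#include <stddef.h>

struct blkbuf {
    int  b_dev;               /* device number */
    long b_blkno;             /* block (sector) number on that device */
    struct blkbuf *b_forw;    /* next buffer on the same device list */
};

struct devlist {
    struct blkbuf *b_forw;    /* first buffer for this device */
};

/* Simplified mapping: byte offset -> logical block number. */
long offset_to_block(long offset, long block_size)
{
    return offset / block_size;
}

/* Start from the device table and compare each buffer with the wanted
   block. Returns the buffer on a hit, NULL on a miss. */
struct blkbuf *cache_lookup(const struct devlist *dp, int dev, long blkno)
{
    for (struct blkbuf *bp = dp->b_forw; bp != NULL; bp = bp->b_forw)
        if (bp->b_dev == dev && bp->b_blkno == blkno)
            return bp;
    return NULL;
}
```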

[Figure: chain of tables user → file → inode → dev; the user table holds fd, the file table holds offset]

25

– read() system call

    fd -> offset -> inode -> device -> search buffer list

    if (hit) then
        done                          /* return data from buffer cache */
    else                              /* buffer cache miss: must read disk */
        if (free buf available?) then
            /* using this free buffer, read disk */
            get buf; read disk; fill buf; done
        else                          /* need replacement first */
            get least recently used (LRU) buffer
            if (dirty?) write old content first   /* delayed write */
            read disk; fill buf; done
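The pseudocode above can be turned into a small self-contained simulation. Everything in it is an assumption made for illustration: the in-memory array standing in for the disk, the tiny sizes, the `last_used` counter used to find the LRU buffer, and the names `struct lru_buf`, `struct cache`, and `cache_get`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define NBUF       3        /* tiny cache so replacement is easy to see */
#define BLOCK_SIZE 16       /* assumed, unrealistically small */
#define NBLOCKS    8        /* size of the simulated disk */

struct lru_buf {
    bool in_use;            /* false => free buffer */
    bool dirty;             /* memory copy differs from disk copy */
    int  blkno;
    unsigned long last_used;/* larger = more recently used */
    char data[BLOCK_SIZE];
};

struct cache {
    struct lru_buf buf[NBUF];
    char disk[NBLOCKS][BLOCK_SIZE];   /* simulated disk */
    unsigned long clock;
    int disk_reads, disk_writes;
};

/* Return the buffer holding blkno, reading it from disk on a miss.
   Returns NULL if blkno is outside the simulated disk. */
struct lru_buf *cache_get(struct cache *c, int blkno)
{
    struct lru_buf *victim = NULL;

    if (blkno < 0 || blkno >= NBLOCKS)
        return NULL;

    for (int i = 0; i < NBUF; i++) {
        struct lru_buf *bp = &c->buf[i];
        if (bp->in_use && bp->blkno == blkno) {       /* hit */
            bp->last_used = ++c->clock;
            return bp;
        }
        /* Prefer a free buffer; otherwise track the least recently used. */
        if (victim == NULL ||
            (victim->in_use &&
             (!bp->in_use || bp->last_used < victim->last_used)))
            victim = bp;
    }

    if (victim->in_use && victim->dirty) {            /* delayed write */
        memcpy(c->disk[victim->blkno], victim->data, BLOCK_SIZE);
        c->disk_writes++;
    }
    memcpy(victim->data, c->disk[blkno], BLOCK_SIZE); /* read disk, fill buf */
    c->disk_reads++;
    victim->in_use = true;
    victim->dirty = false;
    victim->blkno = blkno;
    victim->last_used = ++c->clock;
    return victim;
}
```

A caller that modifies `data` sets `dirty` itself. The write to the simulated disk then happens only when that buffer is chosen as the replacement victim.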

26

mounting

System can have many file systems

Compare with Windows {C: D: E: ...}

27

[Figure: three file systems FS 1, FS 2, FS 3, each laid out as boot block, superblock, inode list, data blocks; logically, three separate FS trees]

At bootup time, specify which file system to boot as the "root file system".

28

[Figure: FS 1, FS 2, FS 3 (each: boot block, superblock, inode list, data blocks) on dsk1, dsk2, dsk3; logically, the "root file system" tree: / with bin, etc, usr, holding date, sh, getty, passwd]

Now all files under the root file system can be accessed.

But how do we access files in other file systems?

Windows: C: D: E:

29

[Figure: the same three file systems on dsk1, dsk2, dsk3; logically, two separate trees: the root file system (/ with bin, etc, usr, holding date, sh, getty, passwd) and the file system on /dev/dsk3 (/ with bin, include, src, holding uts, stdio.h, banner, yacc)]

Mount it!

30

[Figure: the merged tree: / with bin, etc, usr; below usr: bin, include, src (uts, stdio.h, banner, yacc); i-numbers in disk-1 and i-numbers in disk-2, each with its own root and superblock]

System call: mount(path1, path2, option)
  – path1, dev special file: /dev/dsk3 (which)
  – path2, mount point: /usr (where)
  – option, example: read-only (how)

After mounting, /dev/dsk3 is accessed as /usr

31

[Figure: the merged tree, with a mount table entry holding inode (/usr), inode (root), superblock, device number]

Mount Table Entry
Purpose:
  – resolve pathname
  – locate superblock
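How a mount table entry helps pathname resolution can be sketched as follows. When the lookup reaches an inode that is a mount point, it continues at the root inode of the mounted file system. The field names, `NMOUNT`, `struct location`, and `cross_mount_point` are hypothetical, and the superblock pointer that a real entry would carry is left out.

```c
#include <stddef.h>

#define NMOUNT 4   /* assumed table size */

struct mount_entry {
    int in_use;
    int dev;             /* device number of the mounted file system */
    int mounted_on_dev;  /* device holding the mount point inode */
    int mounted_on_ino;  /* i-number of the mount point, e.g. /usr */
    int root_ino;        /* i-number of the mounted file system's root */
};

struct location {
    int dev;
    int ino;
};

/* If (dev, ino) is a mount point, continue at the root of the mounted
   file system; otherwise stay where we are. */
struct location cross_mount_point(const struct mount_entry *table,
                                  struct location at)
{
    for (size_t i = 0; i < NMOUNT; i++) {
        const struct mount_entry *m = &table[i];
        if (m->in_use &&
            m->mounted_on_dev == at.dev &&
            m->mounted_on_ino == at.ino) {
            struct location root = { m->dev, m->root_ino };
            return root;
        }
    }
    return at;
}
```

Since i-numbers are only unique within one file system, the entry is keyed on the (device, i-number) pair rather than on the i-number alone.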

32

Relationship between Tables

[Figure: inode table (inode of /usr, inode of dsk 3 root); mount table (superblock, mounted-on inode, root inode); buf in the buffer cache]

33

Disk File System

• Boot block
• Superblock: pointers to free space in disk
• inode list: pointers to data block
• data block

• mounting file system