Transcript
Page 1: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

Department of Computer Science and Information Engineering, National Taiwan University

Chapter 10: File-System Interface
Chapter 11: File System Implementation

Page 2: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

Chapter 10: File-System Interface

Page 3: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

NEWS Lab, Graduate Institute of Networking and Multimedia, Department of CSIE

Objectives

- To explain the function of file systems
- To describe the interfaces to file systems
- To discuss file-system design tradeoffs, including access methods, file sharing, file locking, and directory structures
- To explore file-system protection

Page 4: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

Chapter 10: File-System Interface

- File Concept
- Access Methods
- Directory Structure
- File-System Mounting
- File Sharing
- Protection

Page 5: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

File Concept

- Contiguous logical address space
- Types:
  - Data
    - numeric
    - character
    - binary
  - Program

Page 6: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

File Structure

- None – sequence of words, bytes
- Simple record structure
  - Lines
  - Fixed length
  - Variable length
- Complex structures
  - Formatted document
  - Relocatable load file
- Can simulate the last two with the first method by inserting appropriate control characters
- Who decides:
  - Operating system
  - Program

Page 7: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

File Attributes

- Name – only information kept in human-readable form
- Identifier – unique tag (number) that identifies the file within the file system
- Type – needed for systems that support different types
- Location – pointer to file location on device
- Size – current file size
- Protection – controls who can read, write, and execute
- Time, date, and user identification – data for protection, security, and usage monitoring
- Information about files is kept in the directory structure, which is maintained on the disk
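As a rough illustration of the attributes listed above, the sketch below reads a few of them through Java's `java.nio.file` API. The class and method names (`FileAttributesSketch`, `describe`) are made up for this note, and which attributes are available (for example the file key used here as an identifier) depends on the platform.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;

class FileAttributesSketch {
    // Returns a one-line summary of some attributes the file system keeps for a path.
    static String describe(Path p) {
        try {
            BasicFileAttributes a = Files.readAttributes(p, BasicFileAttributes.class);
            return "name=" + p.getFileName()
                 + " type=" + (a.isDirectory() ? "directory" : "file")
                 + " size=" + a.size()
                 + " modified=" + a.lastModifiedTime()
                 + " id=" + a.fileKey();   // identifier; may be null on some platforms
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling `FileAttributesSketch.describe(java.nio.file.Paths.get("."))` summarizes the current directory.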

Page 8: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

File Operations

- A file is an abstract data type
- Create
- Write
- Read
- Reposition within file
- Delete
- Truncate
- Open(Fi) – search the directory structure on disk for entry Fi, and move the content of the entry to memory
- Close(Fi) – move the content of entry Fi in memory to the directory structure on disk

Page 9: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

Open Files

Several pieces of data are needed to manage open files:

- File pointer: pointer to last read/write location, per process that has the file open
- File-open count: counter of the number of times a file is open, to allow removal of data from the open-file table when the last process closes it
- Disk location of the file: cache of data access information
- Access rights: per-process access mode information
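The file-open count can be sketched as a small table keyed by file name. This is only an illustration under simplifying assumptions: `OpenFileTableSketch` is a hypothetical name, and a real open-file table would also hold the file pointer, disk location, and access rights.

```java
import java.util.HashMap;
import java.util.Map;

class OpenFileTableSketch {
    // System-wide table: file name -> number of opens currently outstanding.
    private final Map<String, Integer> openCount = new HashMap<>();

    void open(String name) {
        openCount.merge(name, 1, Integer::sum);
    }

    // Returns true when the last opener closed the file and its entry was removed.
    boolean close(String name) {
        Integer n = openCount.get(name);
        if (n == null) throw new IllegalStateException(name + " is not open");
        if (n == 1) {
            openCount.remove(name);
            return true;
        }
        openCount.put(name, n - 1);
        return false;
    }

    boolean isOpen(String name) {
        return openCount.containsKey(name);
    }
}
```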

Page 10: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

Open File Locking

- Provided by some operating systems and file systems
- Mediates access to a file
- Mandatory or advisory:
  - Mandatory – access is denied depending on locks held and requested
  - Advisory – processes can find status of locks and decide what to do

Page 11: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

File Locking Example – Java API (1/2)

import java.io.*;
import java.nio.channels.*;

public class LockingExample {
    public static final boolean EXCLUSIVE = false;
    public static final boolean SHARED = true;

    public static void main(String[] args) throws IOException {
        FileLock sharedLock = null;
        FileLock exclusiveLock = null;
        try {
            RandomAccessFile raf = new RandomAccessFile("file.txt", "rw");

            // get the channel for the file
            FileChannel ch = raf.getChannel();

            // lock the first half of the file - exclusive
            // (the arguments are position, size, shared)
            exclusiveLock = ch.lock(0, raf.length() / 2, EXCLUSIVE);

            /** Now modify the data . . . */

            // release the lock
            exclusiveLock.release();

Page 12: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

File Locking Example – Java API (2/2)

            // lock the second half of the file - shared
            // (the second argument is the size of the region, not its end position)
            long half = raf.length() / 2;
            sharedLock = ch.lock(half, raf.length() - half, SHARED);

            /** Now read the data . . . */

            // release the lock
            sharedLock.release();
        } catch (java.io.IOException ioe) {
            System.err.println(ioe);
        } finally {
            // releasing a lock that was already released has no effect
            if (exclusiveLock != null)
                exclusiveLock.release();
            if (sharedLock != null)
                sharedLock.release();
        }
    }
}

Page 13: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

File Types – Name, Extension

Page 14: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

Access Methods

- Sequential access
    read next
    write next
    reset
    no read after last write (rewrite)
- Direct access
    read n
    write n
    position to n
      read next
      write next
    rewrite n
  n = relative block number
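Sequential access can be simulated on a direct-access file by keeping a current position cp and advancing it after each operation. The sketch below assumes an in-memory array standing in for the file's blocks; the class name `SequentialOnDirect` is illustrative only.

```java
class SequentialOnDirect {
    private final byte[][] blocks;  // blocks[n] stands in for relative block n of the file
    private int cp = 0;             // current position (relative block number)

    SequentialOnDirect(byte[][] blocks) {
        this.blocks = blocks;
    }

    // Direct access: read block n.
    byte[] read(int n) {
        return blocks[n];
    }

    // Direct access: write block n.
    void write(int n, byte[] data) {
        blocks[n] = data;
    }

    // Sequential "reset": cp = 0.
    void reset() {
        cp = 0;
    }

    // Sequential "read next": read cp; cp = cp + 1.
    byte[] readNext() {
        return read(cp++);
    }

    // Sequential "write next": write cp; cp = cp + 1.
    void writeNext(byte[] data) {
        write(cp++, data);
    }
}
```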

Page 15: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

Sequential-access File

Page 16: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

Simulation of Sequential Access on Direct-access File

Page 17: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

Example of Index and Relative Files

Page 18: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

Directory Structure

- A collection of nodes containing information about all files

Figure: a Directory whose entries point to Files F1, F2, F3, F4, ..., Fn

- Both the directory structure and the files reside on disk
- Backups of these two structures are kept on tapes

Page 19: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

Disk Structure

- Disk can be subdivided into partitions
- Disks or partitions can be RAID protected against failure
- Disk or partition can be used raw (without a file system) or formatted with a file system
- Partitions are also known as minidisks or slices
- Entity containing a file system is known as a volume
- Each volume containing a file system also tracks that file system's info in a device directory or volume table of contents
- As well as general-purpose file systems there are many special-purpose file systems, frequently all within the same operating system or computer

Page 20: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

A Typical File-system Organization

Page 21: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

Operations Performed on Directory

- Search for a file
- Create a file
- Delete a file
- List a directory
- Rename a file
- Traverse the file system

Page 22: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

Organize the Directory (Logically) to Obtain

- Efficiency – locating a file quickly
- Naming – convenient to users
  - Two users can have the same name for different files
  - The same file can have several different names
- Grouping – logical grouping of files by properties (e.g., all Java programs, all games, …)

Page 23: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

Single-Level Directory

- A single directory for all users
- Naming problem
- Grouping problem

Page 24: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

Two-Level Directory

- Separate directory for each user
- Path name
- Can have the same file name for different users
- Efficient searching
- No grouping capability

Page 25: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

Tree-Structured Directories (1/3)

Page 26: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

Tree-Structured Directories (2/3)

- Efficient searching
- Grouping capability
- Current directory (working directory)
    cd /spell/mail/prog
    type list

Page 27: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

Tree-Structured Directories (3/3)

- Absolute or relative path name
- Creating a new file is done in the current directory
- Delete a file
    rm <file-name>
- Creating a new subdirectory is done in the current directory
    mkdir <dir-name>
- Example: if in current directory /mail
    mkdir count
  Figure: mail now contains prog, copy, prt, exp, count
- Deleting "mail" deletes the entire subtree rooted at "mail"

Page 28: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

Acyclic-Graph Directories (1/2)

- Have shared subdirectories and files

Page 29: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

Acyclic-Graph Directories (2/2)

- Two different names (aliasing)
- If dict deletes list, a dangling pointer is left behind
- Solutions:
  - Backpointers, so we can delete all pointers (variable-size records are a problem)
  - Backpointers using a daisy chain organization
  - Entry-hold-count solution
- New directory entry type
  - Link – another name (pointer) to an existing file
  - Resolve the link – follow pointer to locate the file

Page 30: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

General Graph Directory (1/2)

Page 31: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

General Graph Directory (2/2)

- How do we guarantee no cycles?
  - Allow only links to files, not subdirectories
  - Garbage collection
  - Every time a new link is added, use a cycle detection algorithm to determine whether it is OK
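One way to read the last option: a new link parent -> child closes a cycle exactly when parent is already reachable from child. The sketch below checks that with a plain graph search before adding the link. It is a minimal illustration, not how any particular file system does it; `DirGraphSketch` and its string-named directories are assumptions made for this note.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class DirGraphSketch {
    // Directory name -> names of the directories it links to.
    private final Map<String, Set<String>> children = new HashMap<>();

    // Adds the link parent -> child unless it would close a cycle.
    // Returns whether the link was added.
    boolean addLink(String parent, String child) {
        if (reaches(child, parent)) return false;
        children.computeIfAbsent(parent, k -> new HashSet<>()).add(child);
        return true;
    }

    // Graph search: is target reachable from start? (start == target counts.)
    private boolean reaches(String start, String target) {
        Deque<String> pending = new ArrayDeque<>();
        Set<String> seen = new HashSet<>();
        pending.push(start);
        while (!pending.isEmpty()) {
            String d = pending.pop();
            if (d.equals(target)) return true;
            if (seen.add(d)) {
                pending.addAll(children.getOrDefault(d, Collections.emptySet()));
            }
        }
        return false;
    }
}
```

The check costs a traversal per new link, which is the overhead the slide's simpler alternative (links to files only) avoids.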

Page 32: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

File System Mounting

- A file system must be mounted before it can be accessed
- Mount point

Figure: (a) Existing. (b) Unmounted Partition

Page 33: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

File Sharing

- Sharing of files on multi-user systems is desirable
- Sharing may be done through a protection scheme
- On distributed systems, files may be shared across a network
- Network File System (NFS) is a common distributed file-sharing method

Page 34: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

File Sharing – Multiple Users

- User IDs identify users, allowing permissions and protections to be per-user
- Group IDs allow users to be in groups, permitting group access rights

Page 35: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

File Sharing – Remote File Systems

- Uses networking to allow file system access between systems
  - Manually via programs like FTP
  - Automatically, seamlessly using distributed file systems
  - Semi-automatically via the World Wide Web
- Client-server model allows clients to mount remote file systems from servers
  - Server can serve multiple clients
  - Client and user-on-client identification is insecure or complicated
  - NFS is the standard UNIX client-server file sharing protocol
  - CIFS is the standard Windows protocol
  - Standard operating system file calls are translated into remote calls
- Distributed information systems (distributed naming services) such as LDAP, DNS, NIS, and Active Directory implement unified access to information needed for remote computing

Page 36: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

File Sharing – Failure Modes

- Remote file systems add new failure modes, due to network failure and server failure
- Recovery from failure can involve state information about the status of each remote request
- Stateless protocols such as NFS include all information in each request, allowing easy recovery but less security

Page 37: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

File Sharing – Consistency Semantics

- Consistency semantics specify how multiple users are to access a shared file simultaneously
  - Similar to Ch 7 process synchronization algorithms
  - Tend to be less complex due to disk I/O and network latency (for remote file systems)
- Andrew File System (AFS) implemented complex remote file sharing semantics
- Unix file system (UFS) semantics implements:
  - Writes to an open file are visible immediately to other users of the same open file
  - Sharing of the file pointer to allow multiple users to read and write concurrently
- AFS has session semantics
  - Writes are only visible to sessions starting after the file is closed
- Immutable semantics

Page 38: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

Protection

- File owner/creator should be able to control:
  - what can be done
  - by whom
- Types of access
  - Read
  - Write
  - Execute
  - Append
  - Delete
  - List

Page 39: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

Access Lists and Groups

- Mode of access: read, write, execute
- Three classes of users

                           R W X
    a) owner access    7   1 1 1
    b) group access    6   1 1 0
    c) public access   1   0 0 1

- Ask the manager to create a group (unique name), say G, and add some users to the group.
- For a particular file (say game) or subdirectory, define an appropriate access; the three digits are owner, group, public:
    chmod 761 game
- Attach a group G to a file game:
    chgrp G game
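To make the 761 example concrete, the sketch below expands each octal digit into its R, W, X bits. `ModeBits` is a hypothetical helper written for this note, not part of any UNIX tool.

```java
class ModeBits {
    // Expands one octal digit (0-7) into its read/write/execute flags.
    static String rwx(int digit) {
        return ((digit & 4) != 0 ? "r" : "-")
             + ((digit & 2) != 0 ? "w" : "-")
             + ((digit & 1) != 0 ? "x" : "-");
    }

    // Decodes a three-digit octal mode into owner, group, public flags.
    static String decode(int mode) {
        return rwx((mode >> 6) & 7) + rwx((mode >> 3) & 7) + rwx(mode & 7);
    }
}
```

With the Java octal literal `0761`, `ModeBits.decode(0761)` gives `rwxrw---x`: full access for the owner, read and write for the group, execute only for everyone else.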

Page 40: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

Windows 7 Access-control List Management

Page 41: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

A Sample UNIX Directory Listing

Page 42: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

End of Chapter 10

Page 43: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

Chapter 11: File System Implementation

Page 44: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

Objectives

- To describe the details of implementing local file systems and directory structures
- To describe the implementation of remote file systems
- To discuss block allocation and free-block algorithms and trade-offs

Page 45: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

Chapter 11: File System Implementation

- File-System Structure
- File-System Implementation
- Directory Implementation
- Allocation Methods
- Free-Space Management
- Efficiency and Performance
- Recovery
- Log-Structured File Systems
- NFS
- Example: WAFL File System

Page 46: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

File-System Structure

- File structure
  - Logical storage unit
  - Collection of related information
- File system resides on secondary storage (disks)
- File system organized into layers
- File control block – storage structure consisting of information about a file

Page 47: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

Layered File System

Page 48: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

A Typical File Control Block

Page 49: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

In-Memory File System Structures

The following figure illustrates the necessary file system structures provided by the operating systems.

Figure 12-3(a) refers to opening a file.

Figure 12-3(b) refers to reading a file.

Page 50: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

In-Memory File System Structures

Figure: (a) opening a file. (b) reading a file.

Page 51: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

Virtual File Systems

- Virtual File Systems (VFS) provide an object-oriented way of implementing file systems.
- VFS allows the same system call interface (the API) to be used for different types of file systems.
- The API is written to the VFS interface, rather than to any specific type of file system.

Page 52: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

Schematic View of Virtual File System

Page 53: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

Directory Implementation

- Linear list of file names with pointers to the data blocks
  - simple to program
  - time-consuming to execute
- Hash table – linear list with hash data structure
  - decreases directory search time
  - collisions – situations where two file names hash to the same location
  - fixed size

Page 54: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

Allocation Methods

An allocation method refers to how disk blocks are allocated for files:

- Contiguous allocation
- Linked allocation
- Indexed allocation

Page 55: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

Contiguous Allocation

- Each file occupies a set of contiguous blocks on the disk
- Simple – only starting location (block #) and length (number of blocks) are required
- Random access
- Wasteful of space (dynamic storage-allocation problem)
- Files cannot grow

Page 56: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

Contiguous Allocation

Mapping from logical to physical:

    LA / 512 gives quotient Q and remainder R

    Block to be accessed = Q + starting address
    Displacement into block = R
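The mapping is a single division. A minimal sketch, assuming 512-word blocks as on the slide and a hypothetical helper class `ContiguousMap`:

```java
class ContiguousMap {
    static final int BLOCK_SIZE = 512;  // words per block, as on the slide

    // Maps logical address la of a file starting at block `start`
    // to { block to be accessed, displacement into block }.
    static long[] map(long la, long start) {
        long q = la / BLOCK_SIZE;   // Q
        long r = la % BLOCK_SIZE;   // R
        return new long[] { start + q, r };
    }
}
```

For example, logical address 1300 in a file starting at block 20 gives Q = 2 and R = 276, so the access goes to block 22 at displacement 276.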

Page 57: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

Contiguous Allocation of Disk Space

Page 58: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

Extent-Based Systems

- Many newer file systems (e.g., Veritas File System) use a modified contiguous allocation scheme
- Extent-based file systems allocate disk blocks in extents
  - An extent is a contiguous run of disk blocks
  - Extents are allocated for file allocation
  - A file consists of one or more extents

Page 59: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

Linked Allocation (1/2)

- Each file is a linked list of disk blocks: blocks may be scattered anywhere on the disk.

Figure: block = pointer followed by data

Page 60: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

Linked Allocation (2/2)

- Simple – need only starting address
- Free-space management system – no waste of space
- No random access
- Mapping:
    LA / 511 gives quotient Q and remainder R
    Block to be accessed is the Qth block in the linked chain of blocks representing the file.
    Displacement into block = R + 1
- File-allocation table (FAT) – disk-space allocation used by MS-DOS and OS/2.
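The lack of random access shows up as a loop: reaching the Qth block means following Q pointers. The sketch below assumes each 512-word block spends its first word on the pointer (hence 511 data words and the R + 1 displacement), and uses an array `next` as a stand-in for reading the pointer stored in each block. The class name and the -1 end marker are assumptions.

```java
class LinkedAllocSketch {
    static final int DATA_WORDS = 511;  // 512-word block minus one pointer word
    static final int END = -1;          // assumed end-of-chain marker

    // Returns { block to be accessed, displacement into block } for logical address la.
    // next[b] is the block that follows b in the file's chain.
    static long[] locate(int[] next, int start, long la) {
        long q = la / DATA_WORDS;
        long r = la % DATA_WORDS;
        int block = start;
        for (long i = 0; i < q; i++) {          // follow Q pointers from the start
            block = next[block];
            if (block == END) throw new IllegalArgumentException("address past end of file");
        }
        return new long[] { block, r + 1 };     // skip the pointer word
    }
}
```

A FAT keeps the same chain in one table at the start of the volume instead of inside the data blocks, so the pointers can be followed without reading each data block.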

Page 61: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

Linked Allocation

Page 62: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

File-Allocation Table

Page 63: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

Indexed Allocation (1/2)

- Brings all pointers together into the index block.
- Logical view.

Figure: index table

Page 64: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

Example of Indexed Allocation

Page 65: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

Indexed Allocation (2/2)

- Need index table
- Random access
- Dynamic access without external fragmentation, but have overhead of index block.
- Mapping from logical to physical in a file of maximum size of 256K words and block size of 512 words. We need only 1 block for the index table.
    LA / 512 gives quotient Q and remainder R
    Q = displacement into index table
    R = displacement into block

Page 66: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

Indexed Allocation – Mapping (1/3)

- Mapping from logical to physical in a file of unbounded length (block size of 512 words).
- Linked scheme – link blocks of index table (no limit on size).
    LA / (512 x 511) gives quotient Q1 and remainder R1
    Q1 = block of index table
    R1 is used as follows:
    R1 / 512 gives quotient Q2 and remainder R2
    Q2 = displacement into block of index table
    R2 = displacement into block of file

Page 67: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

Indexed Allocation – Mapping (2/3)

- Two-level index (maximum file size is 512^3)
    LA / (512 x 512) gives quotient Q1 and remainder R1
    Q1 = displacement into outer-index
    R1 is used as follows:
    R1 / 512 gives quotient Q2 and remainder R2
    Q2 = displacement into block of index table
    R2 = displacement into block of file
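The two-level split is two successive divisions. A small sketch with 512-word blocks; `TwoLevelIndex` is an illustrative name only.

```java
class TwoLevelIndex {
    static final int B = 512;  // words per block and entries per index block

    // Splits logical address la into { Q1, Q2, R2 }:
    // Q1 = displacement into outer-index,
    // Q2 = displacement into block of index table,
    // R2 = displacement into block of file.
    static long[] split(long la) {
        long q1 = la / ((long) B * B);
        long r1 = la % ((long) B * B);
        return new long[] { q1, r1 / B, r1 % B };
    }
}
```

For logical address 300000: 512 x 512 = 262144, so Q1 = 1 and R1 = 37856; then Q2 = 73 and R2 = 480.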

Page 68: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

Indexed Allocation – Mapping (3/3)

Figure: outer-index, index table, file

Page 69: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

Combined Scheme: UNIX (4K bytes per block)

Page 70: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

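Free-Space Management (1/3)

- Bit vector (n blocks), bits numbered 0, 1, 2, ..., n-1
    bit[i] = 0: block[i] free
    bit[i] = 1: block[i] occupied
- Block number calculation for the first free block (with 0 = free, the search skips words whose bits are all 1 and stops at the first 0 bit):
    (number of bits per word) * (number of leading all-1 words) + offset of first 0 bit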
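A sketch of the first-free-block search, assuming the slide's convention that 0 means free and 1 means occupied, 64-bit words, and bit 0 stored in the least significant position (the last two are choices made for this example). Fully occupied words are skipped a word at a time.

```java
class FreeBitmap {
    // words[w], bit k (least significant bit = 0) describes block 64*w + k.
    // 0 = free, 1 = occupied. Returns the first free block number, or -1 if none.
    static long firstFree(long[] words) {
        for (int w = 0; w < words.length; w++) {
            if (words[w] != -1L) {   // -1L has all 64 bits set: fully occupied
                // Invert so free bits become 1, then find the lowest set bit.
                return 64L * w + Long.numberOfTrailingZeros(~words[w]);
            }
        }
        return -1;
    }
}
```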

Page 71: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

Free-Space Management (2/3)

- Bit map requires extra space
  - Example:
      block size = 2^12 bytes
      disk size = 2^30 bytes (1 gigabyte)
      n = 2^30 / 2^12 = 2^18 bits (or 32K bytes)
- Easy to get contiguous files
- Linked list (free list)
  - Cannot get contiguous space easily
  - No waste of space
- Grouping
- Counting

Page 72: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

Free-Space Management (3/3)

- Need to protect:
  - Pointer to free list
  - Bit map
    - Must be kept on disk
    - Copy in memory and disk may differ
    - Cannot allow for block[i] to have a situation where bit[i] = 1 in memory and bit[i] = 0 on disk
  - Solution:
    - Set bit[i] = 1 on disk
    - Allocate block[i]
    - Set bit[i] = 1 in memory

Page 74: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

Linked Free Space List on Disk

Page 75: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

Efficiency and Performance

- Efficiency dependent on:
  - disk allocation and directory algorithms
  - types of data kept in file's directory entry
- Performance
  - disk cache – separate section of main memory for frequently used blocks
  - free-behind and read-ahead – techniques to optimize sequential access
  - improve PC performance by dedicating section of memory as virtual disk, or RAM disk

Page 76: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

Page Cache

- A page cache caches pages rather than disk blocks using virtual memory techniques
- Memory-mapped I/O uses a page cache
- Routine I/O through the file system uses the buffer (disk) cache
- This leads to the following figure

Page 77: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

I/O Without a Unified Buffer Cache

Page 78: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

Unified Buffer Cache

- A unified buffer cache uses the same page cache to cache both memory-mapped pages and ordinary file system I/O

Page 79: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

I/O Using a Unified Buffer Cache

Page 80: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

Recovery

- Consistency checking – compares data in directory structure with data blocks on disk, and tries to fix inconsistencies
- Use system programs to back up data from disk to another storage device (floppy disk, magnetic tape, other magnetic disk, optical)
- Recover lost file or disk by restoring data from backup

Page 81: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

Log Structured File Systems

- Log structured (or journaling) file systems record each update to the file system as a transaction
- All transactions are written to a log
  - A transaction is considered committed once it is written to the log
  - However, the file system may not yet be updated
- The transactions in the log are asynchronously written to the file system
- When the file system is modified, the transaction is removed from the log
- If the file system crashes, all remaining transactions in the log must still be performed
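The commit-then-apply sequence can be sketched with an in-memory queue as the log and a map as the file system. This is a toy model under strong assumptions: a real journal lives on disk and survives the crash, and `JournalSketch` with its key/value "transactions" is invented for this note.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

class JournalSketch {
    // Committed transactions that have not yet been applied, oldest first.
    private final Deque<String[]> log = new ArrayDeque<>();
    // Stand-in for the on-disk file system structures.
    private final Map<String, String> fileSystem = new HashMap<>();

    // A transaction counts as committed once it is in the log.
    void commit(String key, String value) {
        log.addLast(new String[] { key, value });
    }

    // Applies every logged transaction in order and removes it from the log.
    // The same routine serves background write-back and recovery after a crash.
    int replay() {
        int applied = 0;
        while (!log.isEmpty()) {
            String[] t = log.removeFirst();
            fileSystem.put(t[0], t[1]);
            applied++;
        }
        return applied;
    }

    String read(String key) {
        return fileSystem.get(key);
    }
}
```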

Page 82: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

The Sun Network File System (NFS)

- An implementation and a specification of a software system for accessing remote files across LANs (or WANs)
- The implementation is part of the Solaris and SunOS operating systems running on Sun workstations, using an unreliable datagram protocol (UDP/IP) and Ethernet

Page 83: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

NFS (2/3)

- Interconnected workstations are viewed as a set of independent machines with independent file systems, which allows sharing among these file systems in a transparent manner
  - A remote directory is mounted over a local file system directory
  - The mounted directory looks like an integral subtree of the local file system, replacing the subtree descending from the local directory
  - Specification of the remote directory for the mount operation is nontransparent; the host name of the remote directory has to be provided
  - Files in the remote directory can then be accessed in a transparent manner
  - Subject to access-rights accreditation, potentially any file system (or directory within a file system) can be mounted remotely on top of any local directory

Page 84: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

NFS (3/3)

- NFS is designed to operate in a heterogeneous environment of different machines, operating systems, and network architectures; the NFS specification is independent of these media
- This independence is achieved through the use of RPC primitives built on top of an External Data Representation (XDR) protocol used between two implementation-independent interfaces
- The NFS specification distinguishes between the services provided by a mount mechanism and the actual remote-file-access services

Page 85: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

Three Independent File Systems

Page 86: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

Mounting in NFS

Figure: (a) Mounts. (b) Cascading mounts.

Page 87: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

NFS Mount Protocol

- Establishes initial logical connection between server and client
- Mount operation includes name of remote directory to be mounted and name of server machine storing it
  - Mount request is mapped to the corresponding RPC and forwarded to the mount server running on the server machine
  - Export list – specifies local file systems that server exports for mounting, along with names of machines that are permitted to mount them
- Following a mount request that conforms to its export list, the server returns a file handle, a key for further accesses
- File handle – a file-system identifier, and an inode number to identify the mounted directory within the exported file system
- The mount operation changes only the user's view and does not affect the server side

Page 88: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

NFS Protocol

- Provides a set of remote procedure calls for remote file operations. The procedures support the following operations:
  - searching for a file within a directory
  - reading a set of directory entries
  - manipulating links and directories
  - accessing file attributes
  - reading and writing files
- NFS servers are stateless; each request has to provide a full set of arguments (NFS V4 is just coming available – very different, stateful)
- Modified data must be committed to the server's disk before results are returned to the client (lose advantages of caching)
- The NFS protocol does not provide concurrency-control mechanisms

Page 89: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

Three Major Layers of NFS Architecture

- UNIX file-system interface (based on the open, read, write, and close calls, and file descriptors)
- Virtual File System (VFS) layer – distinguishes local files from remote ones, and local files are further distinguished according to their file-system types
  - The VFS activates file-system-specific operations to handle local requests according to their file-system types
  - Calls the NFS protocol procedures for remote requests
- NFS service layer – bottom layer of the architecture
  - Implements the NFS protocol

Page 90: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

Schematic View of NFS Architecture

Page 91: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

NFS Path-Name Translation

- Performed by breaking the path into component names and performing a separate NFS lookup call for every pair of component name and directory vnode
- To make lookup faster, a directory name lookup cache on the client's side holds the vnodes for remote directory names
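The component-by-component translation and the client-side name cache can be sketched as follows. This is a toy model, not the NFS wire protocol: integer handles stand in for vnodes, a map stands in for the server's lookup procedure, and `PathLookupSketch`, `export`, and `rpcCalls` are names invented for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;

class PathLookupSketch {
    // "dirHandle/name" -> handle; stands in for the server answering a lookup call.
    private final Map<String, Integer> server = new HashMap<>();
    // Client-side directory name lookup cache with the same key shape.
    private final Map<String, Integer> cache = new HashMap<>();
    int rpcCalls = 0;   // number of simulated remote lookups

    // Registers an entry `name` with the given handle inside directory `dir`.
    void export(int dir, String name, int handle) {
        server.put(dir + "/" + name, handle);
    }

    // Resolves path one component at a time, starting from the handle `root`.
    int resolve(int root, String path) {
        int current = root;
        for (String name : path.split("/")) {
            if (name.isEmpty()) continue;          // skip leading or doubled slashes
            String key = current + "/" + name;
            Integer handle = cache.get(key);
            if (handle == null) {                  // cache miss: one lookup per component
                rpcCalls++;
                handle = server.get(key);
                if (handle == null) throw new NoSuchElementException(path);
                cache.put(key, handle);
            }
            current = handle;
        }
        return current;
    }
}
```

Resolving the same path twice performs the remote lookups only the first time; the sketch ignores cache invalidation, which a real client has to handle.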

Page 92: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

NFS Remote Operations

- Nearly one-to-one correspondence between regular UNIX system calls and the NFS protocol RPCs (except opening and closing files)
- NFS adheres to the remote-service paradigm, but employs buffering and caching techniques for the sake of performance
- File-blocks cache – when a file is opened, the kernel checks with the remote server whether to fetch or revalidate the cached attributes
  - Cached file blocks are used only if the corresponding cached attributes are up to date
- File-attribute cache – the attribute cache is updated whenever new attributes arrive from the server
- Clients do not free delayed-write blocks until the server confirms that the data have been written to disk

Page 93: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

Example: WAFL File System

- Used on Network Appliance "Filers" – distributed file system appliances
- "Write-anywhere file layout"
- Serves up NFS, CIFS, (http, ftp)
- Random I/O optimized, write optimized
  - NVRAM for write caching
- Similar to Berkeley Fast File System, with extensive modifications

Page 94: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

The WAFL File Layout

Page 95: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

Snapshots in WAFL

Page 96: Chapter 10:   File-System Interface Chapter 11:   File System Implementation

End of Chapter 11

