UNIVERSITY OF MASSACHUSETTS AMHERST Department of Computer Science Operating Systems CMPSCI 377 Distributed File Systems Emery Berger University of Massachusetts Amherst

Operating Systems - Advanced File Systems

Operating Systems

CMPSCI 377

Distributed File Systems

Emery Berger

University of Massachusetts Amherst

Distributed File Systems

Numerous drawbacks of local file systems:

Inconvenient

Administrative overhead

Single point-of-failure

Solution: distributed file systems

FS appears to be local, but data is remote

Two major implementations:

Windows

NFS (Sun’s Network File System)

Complications

Distributed file systems add complexity & many design tradeoffs

Naming – absolute vs. relative (to server)

Remote access vs. caching

Stateless or stateful server

Single image or replication

Naming & Transparency

Issues

How are files named?

Do filenames reveal location?

Do filenames change if file moves?

Do filenames change if user moves?

Location naming

Location transparency: filename does not reveal physical storage location

Normal in Unix

Compare to Windows - C:\foo\bar

Provides location independence: no change if file’s storage location changes

Windows: Absolute Names

Disadvantages:

User must know complete name – local & remote different

Location dependent (cannot move file)

Makes sharing harder

Not fault-tolerant

Advantages:

Easy to find fully specified filename

Easy to add & delete new names

No global state

Scales easily

\\machine name\remote pathname

\\loki\home\emery

NFS: Relative Names

Advantages:

Location transparent

Remote name can change across reboots

/nfs/sting/users1/emery

Disadvantages:

Admin overhead

NFS: Relative Names

Implemented via mount points

one level of indirection!

Each host: local names → remote locations

Mount table (/etc/fstab)

<remote pathname @ machine, local pathname>

/courses/cs300/cs377

% cat /etc/fstab
elsrv4:/courses /courses nfs intr,hard,rw 0 0
elsrv4:/courses/cs100_200 /courses/cs100_200 nfs intr,hard,rw 0 0
elsrv4:/courses/cs300 /courses/cs300 nfs intr,hard,rw 0 0
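
The one level of indirection can be sketched in a few lines of Python. This is only an illustration, not how any real NFS client is written: the MOUNT_TABLE dictionary and the resolve function are hypothetical names, and the table simply copies the three fstab entries shown on this slide. The longest matching mount point is assumed to win.

```python
# Hypothetical mount table copied from the fstab entries on this slide:
# local mount point -> (server, remote pathname).
MOUNT_TABLE = {
    "/courses": ("elsrv4", "/courses"),
    "/courses/cs100_200": ("elsrv4", "/courses/cs100_200"),
    "/courses/cs300": ("elsrv4", "/courses/cs300"),
}


def resolve(local_path, table=MOUNT_TABLE):
    """Return (server, remote_path) for local_path, or None if it is purely local."""
    best = None
    for mount_point in table:
        # A mount point matches if it equals the path or is a parent directory of it.
        if local_path == mount_point or local_path.startswith(mount_point + "/"):
            if best is None or len(mount_point) > len(best):
                best = mount_point  # assumption: longest prefix wins
    if best is None:
        return None
    server, remote_root = table[best]
    # Replace the local mount point with the remote pathname it stands for.
    return server, remote_root + local_path[len(best):]


print(resolve("/courses/cs300/cs377"))
print(resolve("/tmp/scratch"))
```

Changing an entry in the table changes where a local name points without changing the name itself, which is what makes the naming location transparent.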

NFS Example

URLs Viewed as File System

Uniform Resource Locator (URL) names: increasingly standard way to access data

protocol://machine/path/to/file

Good? Bad?

Looks like Windows… same?
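
One way to explore the comparison is to split both kinds of name into their parts. The sketch below is a rough illustration: split_unc and split_url are hypothetical helpers, and the host example.org is a placeholder, not something from the slides. urlsplit comes from Python's standard library.

```python
from urllib.parse import urlsplit


def split_unc(name):
    # Split a Windows-style \\machine\path name into (machine, path).
    if not name.startswith("\\\\"):
        raise ValueError("not an absolute remote name")
    machine, _, path = name[2:].partition("\\")
    return machine, "\\" + path


def split_url(url):
    # Split protocol://machine/path into (protocol, machine, path).
    parts = urlsplit(url)
    return parts.scheme, parts.netloc, parts.path


print(split_unc(r"\\loki\home\emery"))
print(split_url("http://example.org/path/to/file"))
```

Both names carry the machine inside the name, so both are location dependent; the URL additionally names the protocol used to reach the data.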

Distributed File Systems: Issues

Naming & transparency

Remote file access & caching

Server with state or without

Replication

Remote File Access & Caching

Can access files two ways

Remotely: returns results using RPC

Locally: transfer part of file = caching

Caching issues:

Performance: Where & when to cache file blocks?

Correctness: When to propagate updates back to remote file?

What happens when multiple clients cache same file?

Remote File Caching

Local disk:

Reduces access time (compared to remote)

Safe if node fails

– Difficult to keep copy consistent with remote file

– Requires client to have disk (…)

Local memory:

Quick

Works without disks

– Difficult to keep copy consistent with remote file

– Smaller cache size

– Not fault-tolerant

Cache Update Policies

Write-through: always write to remote disk

Reliable

– Low-performance = remote service for all writes

Write-back: write only to cache

Write to disk on evictions, periodic sync

Quick

Reduces network traffic (n writes to same block)

– User machine crashes ⇒ data loss
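
The tradeoff between the two policies can be seen in a toy simulation. This is a sketch under simplifying assumptions, not a real client cache: RemoteDisk, WriteThroughCache and WriteBackCache are hypothetical classes, the "network" is a method call, and eviction is left out so that sync is the only point where dirty blocks reach the server.

```python
class RemoteDisk:
    """Stands in for the file server; counts how many writes reach it."""

    def __init__(self):
        self.blocks = {}
        self.writes = 0

    def write(self, block_no, data):
        self.blocks[block_no] = data
        self.writes += 1


class WriteThroughCache:
    def __init__(self, remote):
        self.remote = remote
        self.cache = {}

    def write(self, block_no, data):
        self.cache[block_no] = data
        self.remote.write(block_no, data)  # every write goes to the server


class WriteBackCache:
    def __init__(self, remote):
        self.remote = remote
        self.cache = {}
        self.dirty = set()

    def write(self, block_no, data):
        self.cache[block_no] = data
        self.dirty.add(block_no)  # server is not contacted yet

    def sync(self):
        # Periodic sync: push each dirty block to the server once.
        for block_no in sorted(self.dirty):
            self.remote.write(block_no, self.cache[block_no])
        self.dirty.clear()


through_disk, back_disk = RemoteDisk(), RemoteDisk()
through, back = WriteThroughCache(through_disk), WriteBackCache(back_disk)
for i in range(5):  # n = 5 writes to the same block
    through.write(7, f"v{i}")
    back.write(7, f"v{i}")
unsynced = back_disk.writes  # a client crash at this point loses block 7's updates
back.sync()
print(through_disk.writes, unsynced, back_disk.writes)
```

Five writes to the same block cost five remote writes under write-through but only one under write-back, at the price of a window in which the server has seen none of them.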

Cache Consistency

Client-initiated consistency: client contacts server and checks consistency

every access

at given intervals

only upon opening a file

Server-initiated consistency: server detects potential conflicts, invalidates caches

Server needs to know:

which clients have cached which parts of which files, plus

which clients are readers & which are writers
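
Both approaches can be sketched side by side. The code below is an illustration under stated assumptions rather than any real protocol: Server, PollingClient and CallbackClient are hypothetical classes, whole files are cached instead of parts of files, a per-file version counter stands in for whatever the client would compare, and readers are not distinguished from writers. PollingClient checks only upon opening a file; CallbackClient relies on the server remembering who has cached what.

```python
class Server:
    def __init__(self):
        self.data = {}     # filename -> contents
        self.version = {}  # filename -> version counter
        self.cachers = {}  # filename -> clients holding a cached copy (server state)

    def get_version(self, name):
        return self.version.get(name, 0)

    def read(self, name, client=None):
        if client is not None:
            # Server-initiated consistency needs to remember who cached the file.
            self.cachers.setdefault(name, set()).add(client)
        return self.data.get(name, ""), self.get_version(name)

    def write(self, name, contents):
        self.data[name] = contents
        self.version[name] = self.get_version(name) + 1
        # Server-initiated: invalidate every registered cached copy.
        for client in self.cachers.pop(name, set()):
            client.invalidate(name)


class PollingClient:
    """Client-initiated: asks the server for the version on every open."""

    def __init__(self, server):
        self.server = server
        self.cache = {}  # filename -> (contents, version)

    def open(self, name):
        cached = self.cache.get(name)
        if cached is None or cached[1] != self.server.get_version(name):
            self.cache[name] = self.server.read(name)
        return self.cache[name][0]


class CallbackClient:
    """Server-initiated: trusts its cache until the server invalidates it."""

    def __init__(self, server):
        self.server = server
        self.cache = {}

    def open(self, name):
        if name not in self.cache:
            self.cache[name] = self.server.read(name, client=self)
        return self.cache[name][0]

    def invalidate(self, name):
        self.cache.pop(name, None)


server = Server()
server.write("notes", "old")
poller, callback = PollingClient(server), CallbackClient(server)
first = (poller.open("notes"), callback.open("notes"))
server.write("notes", "new")
second = (poller.open("notes"), callback.open("notes"))
print(first, second)
```

The polling client pays a round trip on every open but the server stays simple; the callback client avoids that traffic, but only because the server now keeps per-client state.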

Case Study: Network File System

NFS: standard for distributed UNIX file access

Designed to run on LANs

Nodes: both servers & clients

Servers have no state = no info about clients

Uses mount protocol to make global name local

/etc/exports

local names server willing to export

/etc/fstab

global names that local nodes import

global name must be in /etc/exports on server

NFS Implementation

Set of RPC operations for remote file access:

Directory search, reading directory entries

Manipulating links & directories

Accessing file attributes

Reading/writing files
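
Because the server keeps no information about clients, a read request has to carry everything needed to serve it. The sketch below illustrates that idea only: StatelessServer and Client are hypothetical classes, the handle "fh1" is made up, and a method call stands in for the RPC. It is not the actual NFS wire protocol.

```python
class StatelessServer:
    """Holds file contents but no per-client state (no open-file table, no offsets)."""

    def __init__(self, files):
        self.files = files  # file handle -> bytes

    def read(self, handle, offset, count):
        # Everything needed to serve the request arrives in the request itself.
        return self.files[handle][offset:offset + count]


class Client:
    """The client, not the server, remembers the current file position."""

    def __init__(self, server, handle):
        self.server = server
        self.handle = handle
        self.offset = 0

    def read(self, count):
        data = self.server.read(self.handle, self.offset, count)
        self.offset += len(data)
        return data


files = {"fh1": b"distributed file systems"}
client = Client(StatelessServer(files), "fh1")
first = client.read(11)
# Simulate a server crash and restart: a fresh server object over the same files.
client.server = StatelessServer(files)
second = client.read(5)
print(first, second)
```

The second read succeeds after the simulated restart because the server had nothing to forget; the cost is that every request repeats the handle and offset.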

NFS Implementation

The End

Global Name Space

Single name space: no matter which node you are on, filenames remain the same

Examples:

AFS (CMU’s Andrew File System)

Sprite (Berkeley)

Client: gets filename structure from server(s)

When users access files, server sends copies to workstation, where they are cached

Global Name Space: Pros & Cons

Advantages:

Naming – consistent

Ensures all files are the same regardless of where you log in

Late binding of names ⇒ moving them is easier

Disadvantages:

Difficult for OS to keep files consistent (caching)

Global name space may limit flexibility

Performance issues