UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science
Operating Systems
CMPSCI 377
Distributed File Systems
Emery Berger
University of Massachusetts Amherst
Distributed File Systems
Numerous drawbacks of local file systems:
Inconvenient
Administrative overhead
Single point-of-failure
Solution: distributed file systems
FS appears to be local, but data is remote
Two major implementations:
Windows
NFS (Sun’s Network File System)
Complications
Distributed file systems add complexity & many design tradeoffs
Naming – absolute vs. relative (to server)
Remote access vs. caching
Stateless or stateful server
Single image or replication
Naming & Transparency
Issues
How are files named?
Do filenames reveal location?
Do filenames change if file moves?
Do filenames change if user moves?
Location naming
Location transparency: filename does not reveal physical storage location
Normal in Unix
Compare to Windows - C:\foo\bar
Provides location independence: no change if file’s storage location changes
Windows: Absolute Names
\\machine name\remote pathname
\\loki\home\emery
Disadvantages:
User must know complete name – local & remote different
Location dependent (cannot move file)
Makes sharing harder
Not fault-tolerant
Advantages:
Easy to find fully specified filename
Easy to add & delete new names
No global state
Scales easily
NFS: Relative Names
Advantages:
Location transparent
Remote name can change across reboots
/nfs/sting/users1/emery
Disadvantages:
Admin overhead
NFS: Relative Names
Implemented via mount points
one level of indirection!
Each host: local names → remote locations
Mount table (/etc/fstab)
<remote pathname @ machine, local pathname>
/courses/cs300/cs377
% cat /etc/fstab
elsrv4:/courses /courses nfs intr,hard,rw 0 0
elsrv4:/courses/cs100_200 /courses/cs100_200 nfs intr,hard,rw 0 0
elsrv4:/courses/cs300 /courses/cs300 nfs intr,hard,rw 0 0
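The mount-point indirection can be sketched as a longest-prefix lookup. This is only an illustration of the idea, not how any NFS client is implemented: the table mirrors the /etc/fstab entries shown on the slide, and `MOUNT_TABLE` and `resolve` are hypothetical names.

```python
# Hypothetical in-memory copy of the mount table from the slide:
# local mount point -> (machine, remote pathname)
MOUNT_TABLE = {
    "/courses": ("elsrv4", "/courses"),
    "/courses/cs100_200": ("elsrv4", "/courses/cs100_200"),
    "/courses/cs300": ("elsrv4", "/courses/cs300"),
}


def resolve(local_path, table=MOUNT_TABLE):
    """Map a local name to (machine, remote path), or None if it is local."""
    best = None
    for mount_point in table:
        # Match the mount point itself or anything below it.
        if local_path == mount_point or local_path.startswith(mount_point + "/"):
            # Prefer the most specific (longest) mount point.
            if best is None or len(mount_point) > len(best):
                best = mount_point
    if best is None:
        return None
    machine, remote_root = table[best]
    return machine, remote_root + local_path[len(best):]


print(resolve("/courses/cs300/cs377"))
print(resolve("/home/emery"))
```

The user only ever types the local name; which machine serves it is decided by the table, so changing the table changes the location without changing the name.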
NFS Example
URLs Viewed as File System
Uniform Resource Locator (URL) names: an increasingly standard way to access data
protocol://machine/path/to/file
Good? Bad?
Looks like Windows… same?
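One way to explore the "same?" question is to split both kinds of name into their parts. The sketch below uses Python's standard `urllib.parse.urlsplit` for the URL; `split_unc` is a hypothetical helper written for this illustration, and the example URL is made up from the machine name used earlier in the slides.

```python
from urllib.parse import urlsplit


def split_url(url):
    """Return (protocol, machine, path) for protocol://machine/path/to/file."""
    parts = urlsplit(url)
    return parts.scheme, parts.hostname, parts.path


def split_unc(name):
    # Hypothetical helper: split \\machine name\remote pathname.
    if not name.startswith("\\\\"):
        raise ValueError("not an absolute remote name: " + name)
    machine, _, path = name[2:].partition("\\")
    return machine, "\\" + path


print(split_url("http://loki/home/emery"))
print(split_unc(r"\\loki\home\emery"))
```

Both names carry the machine inside the name itself, so both are location dependent; the URL additionally names the access protocol.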
Distributed File Systems: Issues
Naming & transparency
Remote file access & caching
Server with state or without
Replication
Remote File Access & Caching
Can access files two ways
Remotely: returns results using RPC
Locally: transfer part of file = caching
Caching issues:
Performance: Where & when to cache file blocks?
Correctness: When to propagate updates back to remote file?
What happens when multiple clients cache same file?
Remote File Caching
Local disk:
Reduces access time (compared to remote)
Safe if node fails
– Difficult to keep copy consistent with remote file
– Requires client to have disk (…)
Local memory:
Quick
Works without disks
– Difficult to keep copy consistent with remote file
– Smaller cache size
– Not fault-tolerant
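A small in-memory block cache makes the "smaller cache size" tradeoff concrete. This is a minimal sketch under assumptions: the slides do not name an eviction policy, so least-recently-used (LRU) is chosen here for illustration, and `BlockCache` and `fetch_remote` are hypothetical names.

```python
from collections import OrderedDict


class BlockCache:
    """Client-side memory cache of remote file blocks (LRU eviction assumed)."""

    def __init__(self, fetch_remote, capacity):
        self.fetch_remote = fetch_remote  # callable: block number -> data
        self.capacity = capacity          # maximum number of cached blocks
        self.blocks = OrderedDict()       # oldest entry first
        self.remote_fetches = 0

    def read(self, block_no):
        if block_no in self.blocks:
            # Hit: served locally, mark as most recently used.
            self.blocks.move_to_end(block_no)
            return self.blocks[block_no]
        # Miss: go to the remote server, then cache the result.
        data = self.fetch_remote(block_no)
        self.remote_fetches += 1
        self.blocks[block_no] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict least recently used
        return data


cache = BlockCache(lambda n: f"block {n}", capacity=2)
for n in [0, 1, 0, 2, 1]:
    cache.read(n)
print(cache.remote_fetches, list(cache.blocks))
```

With room for only two blocks, the final read of block 1 misses again because block 1 was evicted to make space for block 2.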
Cache Update Policies
Write-through: always write to remote disk
Reliable
– Low-performance = remote service for all writes
Write-back: write only to cache
Write to disk on evictions, periodic sync
Quick
Reduces network traffic (n writes to same block)
– User machine crashes ⇒ data loss
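The two policies can be compared by counting remote writes. The classes below are a hypothetical sketch (the remote disk is just a dictionary with a write counter), not a model of any particular file system.

```python
class RemoteDisk:
    """Stand-in for the server's disk; counts how often it is written."""

    def __init__(self):
        self.blocks = {}
        self.writes = 0

    def write(self, block_no, data):
        self.blocks[block_no] = data
        self.writes += 1


class WriteThroughCache:
    def __init__(self, remote):
        self.remote = remote
        self.cache = {}

    def write(self, block_no, data):
        self.cache[block_no] = data
        self.remote.write(block_no, data)  # every write goes to the server


class WriteBackCache:
    def __init__(self, remote):
        self.remote = remote
        self.cache = {}
        self.dirty = set()  # blocks changed locally but not yet on the server

    def write(self, block_no, data):
        self.cache[block_no] = data
        self.dirty.add(block_no)  # no network traffic yet

    def sync(self):
        # Called on eviction or periodically: push dirty blocks to the server.
        for block_no in sorted(self.dirty):
            self.remote.write(block_no, self.cache[block_no])
        self.dirty.clear()


def count_remote_writes(cache_cls, n):
    """Write the same block n times, then sync if the policy needs it."""
    remote = RemoteDisk()
    cache = cache_cls(remote)
    for i in range(n):
        cache.write(0, f"version {i}")
    if hasattr(cache, "sync"):
        cache.sync()
    return remote.writes


print(count_remote_writes(WriteThroughCache, 5))
print(count_remote_writes(WriteBackCache, 5))
```

The n writes to the same block collapse into one remote write under write-back. The cost is the window before `sync()`: if the client crashes in that window, the server never sees the data.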
Cache Consistency
Client-initiated consistency: client contacts server and checks consistency
on every access,
at given intervals, or
only upon opening a file
Server-initiated consistency: server detects potential conflicts, invalidates caches
Server needs to know:
which clients have cached which parts of which files, plus
which clients are readers & which are writers
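Server-initiated consistency can be sketched with a server that records which clients cache which blocks and invalidates the other copies on a write. This is a simplified illustration: it treats every write as a conflict and does not track readers and writers separately, and the `Server` and `Client` classes are hypothetical.

```python
from collections import defaultdict


class Client:
    def __init__(self, name):
        self.name = name
        self.cache = {}  # (file, block) -> cached value

    def invalidate(self, key):
        self.cache.pop(key, None)


class Server:
    def __init__(self):
        self.data = {}
        # State the server must keep: which clients cache which blocks.
        self.cachers = defaultdict(set)

    def read(self, client, file, block):
        key = (file, block)
        value = self.data.get(key)
        client.cache[key] = value
        self.cachers[key].add(client)
        return value

    def write(self, client, file, block, value):
        key = (file, block)
        self.data[key] = value
        # Invalidate every other cached copy of this block.
        for other in self.cachers[key] - {client}:
            other.invalidate(key)
        client.cache[key] = value
        self.cachers[key] = {client}


server = Server()
a, b = Client("a"), Client("b")
server.write(a, "notes.txt", 0, "v1")
server.read(b, "notes.txt", 0)
server.write(a, "notes.txt", 0, "v2")
print(("notes.txt", 0) in b.cache)
```

After the second write, client b no longer holds a stale copy and must re-read from the server. The price is that the server is now stateful.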
Case Study: Network File System
NFS: standard for distributed UNIX file access
Designed to run on LANs
Nodes: both servers & clients
Servers have no state = no info about clients
Uses mount protocol to make global name local
/etc/exports: local names the server is willing to export
/etc/fstab: global names that local nodes import
A global name must be in /etc/exports on the server
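The rule that every imported name must appear in the server's /etc/exports can be expressed as a simple check. This is a sketch under assumptions: the fstab text reuses the entries from the earlier slide, the exports are modeled as a plain dictionary rather than parsed from a real /etc/exports file, and the match is exact, which is a simplification.

```python
FSTAB = """\
elsrv4:/courses /courses nfs intr,hard,rw 0 0
elsrv4:/courses/cs100_200 /courses/cs100_200 nfs intr,hard,rw 0 0
elsrv4:/courses/cs300 /courses/cs300 nfs intr,hard,rw 0 0
"""


def parse_fstab(text):
    """Return (machine, remote path, local path) for each nfs entry."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, comments, and non-NFS file systems.
        if len(fields) < 3 or fields[0].startswith("#") or fields[2] != "nfs":
            continue
        machine, _, remote_path = fields[0].partition(":")
        entries.append((machine, remote_path, fields[1]))
    return entries


def unexported(entries, exports):
    """List imports whose remote path the server does not export.

    exports: hypothetical dict of machine -> set of exported local names.
    """
    return [e for e in entries if e[1] not in exports.get(e[0], set())]


exports = {"elsrv4": {"/courses", "/courses/cs300"}}
print(unexported(parse_fstab(FSTAB), exports))
```

Here the import of /courses/cs100_200 would be flagged, because the (made-up) export list for elsrv4 does not include it.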
NFS Implementation
Set of RPC operations for remote file access:
Directory search, reading directory entries
Manipulating links & directories
Accessing file attributes
Reading/writing files
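Because the server keeps no information about clients, each request has to carry everything needed to serve it. The sketch below imitates that style for reads and attribute access; the class and method names are hypothetical and local method calls stand in for RPC.

```python
class StatelessFileServer:
    """Keeps file contents only; remembers nothing about clients or offsets."""

    def __init__(self, files):
        self.files = files  # file handle -> bytes

    def getattr(self, handle):
        return {"size": len(self.files[handle])}

    def read(self, handle, offset, count):
        # The request names the file, the position, and the length itself.
        return self.files[handle][offset:offset + count]


def read_whole_file(server, handle, chunk=4):
    """Client side: the client, not the server, tracks the current offset."""
    offset, out = 0, b""
    while True:
        data = server.read(handle, offset, chunk)
        if not data:
            return out
        out += data
        offset += len(data)


server = StatelessFileServer({"h1": b"hello, world"})
print(read_whole_file(server, "h1"))
print(server.getattr("h1"))
```

Since no per-client state lives on the server, a server that restarts between two `read` calls can answer the second call exactly as before.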
NFS Implementation
The End
Global Name Space
Single name space: no matter which node you are on, filenames remain the same
Examples:
AFS (CMU’s Andrew File System)
Sprite (Berkeley)
Client: gets filename structure from server(s)
When users access files, server sends copies to workstation, where they are cached
Global Name Space: Pros & Cons
Advantages:
Naming – consistent
Ensures all files are the same regardless of where you log in
Late binding of names ⇒ moving them is easier
Disadvantages:
Difficult for OS to keep files consistent (caching)
Global name space may limit flexibility
Performance issues