Department of Computer Science • Information Processing Laboratory

Lecture 15 File Systems

xlanchen@06/03/2005

xlanchen@06/03/2005 Understanding the Inside of Windows2000

Contents

Windows 2000 File System Formats

NTFS Design Goals and Features

File System Driver Architecture

NTFS File System Driver

NTFS On-Disk Structure

Windows 2000 File System Formats

CDFS (CD-ROM File System)

1988, read-only formatting standard for CD-ROM media

UDF

FAT12, FAT16, and FAT32

NTFS

the native file system format of Windows 2000

FAT series (12, 16, 32)

FAT format organization

Example: file allocation table
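
As a rough illustration of how a file allocation table links the clusters of a file, the sketch below follows a cluster chain through a small table. The table contents, the end-of-chain marker value, and the helper name `follow_chain` are assumptions chosen for illustration, not values taken from a real volume.

```python
# Hypothetical end-of-chain marker; real FAT variants use reserved values
# whose width depends on the format (FAT12, FAT16, FAT32).
END_OF_CHAIN = 0xFFFF

def follow_chain(fat, first_cluster):
    """Return the ordered list of clusters that make up one file."""
    chain = []
    cluster = first_cluster
    while cluster != END_OF_CHAIN:
        if cluster in chain:
            raise ValueError("loop in cluster chain")
        chain.append(cluster)
        cluster = fat[cluster]  # each entry names the next cluster of the file
    return chain

# Illustrative table: key = cluster number, value = next cluster of the same file.
fat = {2: 3, 3: 4, 4: END_OF_CHAIN, 5: 9, 9: 6, 6: END_OF_CHAIN}
print(follow_chain(fat, 2))  # [2, 3, 4]
print(follow_chain(fat, 5))  # [5, 9, 6]
```

The second file shows that a chain does not have to be contiguous on disk: the table alone records the order.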

NTFS Design Goals

Recoverability

Security

Data Redundancy

Fault Tolerance

NTFS Features

Multiple data streams

Unicode-based names

General indexing facility

Dynamic bad-cluster remapping

Hard links and junctions

Compression and sparse files

Change logging

Per-user volume quotas

Link tracking

Encryption

POSIX support

Defragmentation

NTFS files

NTFS file = set of ($attribute-name, data) pairs

Standard attributes include: FileName, flags, data, MS-DOS name, ACL, and others

Filename up to 255 characters

Values (data) held in MFT entry if possible

Otherwise: Data attribute = set of (start-VCN, start-LCN, #clusters)

Allows sequence of VCNs to be discovered

Provides VCN->LCN->cluster mapping

“attribute list” attribute added if the MFT entry is too small

Points to the (first) overflow MFT record for mappings
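
The (start-VCN, start-LCN, #clusters) triples above can be sketched as a small lookup. This is a minimal illustration; the `Run` type, the function name, and the cluster numbers are assumptions made for the example, not the on-disk encoding.

```python
from typing import List, NamedTuple, Optional

class Run(NamedTuple):
    start_vcn: int  # first virtual cluster number covered by the run
    start_lcn: int  # logical cluster number where the run begins on the volume
    clusters: int   # number of contiguous clusters in the run

def vcn_to_lcn(runs: List[Run], vcn: int) -> Optional[int]:
    """Map a file-relative cluster (VCN) to a volume-relative cluster (LCN)."""
    for run in runs:
        if run.start_vcn <= vcn < run.start_vcn + run.clusters:
            return run.start_lcn + (vcn - run.start_vcn)
    return None  # the VCN is not covered by any run

# Illustrative file stored in two runs of four clusters each.
runs = [Run(0, 1355, 4), Run(4, 1588, 4)]
print(vcn_to_lcn(runs, 5))  # 1589
print(vcn_to_lcn(runs, 8))  # None
```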

NTFS Directories

An index of filenames

Index blocks organized as a balanced (B+) tree

Tree pointer gives (VCN,LCN) of next block

Index entries contain:

File Reference Number, plus

Size, timestamp, etc. (for directory browsing; saves reading the MFT record to find file attributes)

NT4 indexes on filename only

NT5 indexes on other file attributes also

File System Driver (FSD) Architecture

FSDs manage file system formats

Kernel mode

Two different types of FSD (Windows 2000)

Local FSDs

Remote FSDs

Local FSDs

A local FSD must register with the I/O manager

Local FSDs include:

Ntfs.sys, Fastfat.sys, Udfs.sys, Cdfs.sys, and the Raw FSD (integrated in Ntoskrnl.exe).

Local FSDs and related OS concepts

Boot sector

I/O manager

volume parameter block (VPB)

Storage device object and file system device object

Cache manager

Remote FSDs

Remote FSDs consist of two components:

a client and a server

Client-side remote FSD (Windows 2000: LANMan Redirector)

allows applications to access remote files and directories

Accepts I/O request & translates into network commands

Server-side FSD (Windows 2000: LANMan Server)

Listens for the commands and fulfills them

Remote FSD operation

EXPERIMENT

Viewing the List of Registered File Systems

File system operation

Two ways

Directly, file I/O functions

Indirectly, file mapping

An FSD can be invoked through several paths

Explicit file I/O

From the memory manager's modified page writer

Indirectly from the cache manager's lazy writer

Indirectly from the cache manager's read-ahead thread

From the memory manager's page fault handler

Components involved in file system I/O

NTFS FSD

Components of the Windows 2000 I/O system

Layered drivers

NTFS and related components

Log File Service (LFS)

NTFS provides file system recoverability by means of a transaction-processing technique called logging

LFS is a series of kernel-mode routines inside the NTFS driver

NTFS data structures

NTFS On-Disk Structure

Volumes

Clusters

Directories

The storage of actual file data and attribute information

NTFS data compression

Volumes

A logical partition on a disk

A disk can have one volume or several

NTFS stores all file system data as ordinary files, such as

Bitmaps, directories, system bootstrap

Clusters

Cluster size established when formatting

Also called cluster factor

2^n sectors

Cluster vs. sector

NTFS is independent from physical sector sizes

LCN, logical cluster numbers

the numbering of all clusters of the volume

VCN, virtual cluster numbers

Number the clusters belonging to a particular file
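
The relationship between sectors, clusters, and LCNs can be worked through numerically. The sketch below assumes 512-byte sectors and a cluster factor of 2^3 sectors; both values and the function names are illustrative assumptions.

```python
def cluster_size(bytes_per_sector: int, n: int) -> int:
    """Cluster size in bytes for a cluster factor of 2**n sectors."""
    return bytes_per_sector * (2 ** n)

def lcn_to_byte_offset(lcn: int, bytes_per_cluster: int) -> int:
    """Byte offset of a cluster, measured from the start of the volume."""
    return lcn * bytes_per_cluster

size = cluster_size(512, 3)            # 8 sectors of 512 bytes each
print(size)                            # 4096
print(lcn_to_byte_offset(1355, size))  # 5550080
```

Because every on-disk reference is expressed in clusters, only this final multiplication depends on the physical sector size.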

Master File Table (MFT)

All data is contained in files, including metadata

Metadata

the data structures used to locate and retrieve files,

the bootstrap data,

the bitmap that records the allocation state of the entire volume

Easy to locate and maintain

Each can be protected by a security descriptor

Master File Table (MFT)

The MFT is the heart of the NTFS volume structure

Implemented as an array of file records

File records, fixed size, 1KB

Logically, contains one record for each file including the MFT itself.

Metadata files (with name prefixed with “$”)

$Mft

Metadata files in MFT

Mount a volume with MFT

Find the physical disk address of the MFT from the boot sector

Find information inside the file record of the MFT

Open more metadata files

Perform the file system recovery operation

Open the other metadata files

Other metadata files

log file ($LogFile)

root directory ("\")

bitmap file ($Bitmap)

security file ($Secure)

boot file ($Boot)

bad-cluster file ($BadClus)

extensions ($Extend), a metadata directory

object identifier file ($ObjId), the quota file ($Quota), the change journal file ($UsnJrnl), and the reparse point file ($Reparse).

File record vs. File

Normally, 1:1

May be n:1

If a file has a large number of attributes

or becomes highly fragmented

The first one is called the base file record

It stores the locations of the others

The others are extended file records

File Reference Numbers

A file on an NTFS volume is identified by a file reference

64-bit

File number: the index of the file's file record position in the MFT

Sequence number: incremented each time an MFT file record position is reused
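
A 64-bit file reference can be sketched as two packed fields. The slide gives only the total width; the split used below (file number in the low 48 bits, sequence number in the high 16 bits) follows the layout commonly documented for NTFS and should be read as an assumption, as should the function names.

```python
FILE_NUMBER_BITS = 48  # assumed width of the file number field
FILE_NUMBER_MASK = (1 << FILE_NUMBER_BITS) - 1
SEQUENCE_MAX = 0xFFFF  # assumed 16-bit sequence number

def make_file_reference(file_number: int, sequence_number: int) -> int:
    """Pack a file number and a sequence number into one 64-bit value."""
    if not 0 <= file_number <= FILE_NUMBER_MASK:
        raise ValueError("file number out of range")
    if not 0 <= sequence_number <= SEQUENCE_MAX:
        raise ValueError("sequence number out of range")
    return (sequence_number << FILE_NUMBER_BITS) | file_number

def split_file_reference(reference: int):
    """Return (file_number, sequence_number) from a packed reference."""
    return reference & FILE_NUMBER_MASK, reference >> FILE_NUMBER_BITS

old = make_file_reference(42, 1)  # record position 42, first use
new = make_file_reference(42, 2)  # same position after being reused
print(split_file_reference(old))  # (42, 1)
print(old == new)                 # False
```

The last line shows why the sequence number matters: a reference kept from before the record position was reused no longer matches the current one.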

File Records

File, a collection of attribute/value pairs

Filename

time stamp information

unnamed data attribute

additional named data attributes

Attribute | Attribute Name | Description

Volume information | $VOLUME_INFORMATION, $VOLUME_NAME | These attributes are present only in the $Volume metadata file. They store volume version and label information.

Standard information | $STANDARD_INFORMATION | File attributes such as read-only, archive, and so on; time stamps, including when the file was created or last modified; and how many directories point to the file (its hard link count).

Filename | $FILE_NAME | The file's name in Unicode characters. A file can have multiple filename attributes, as it does when a hard link to a file exists or when a file with a long name has an automatically generated "short name" for access by MS-DOS and 16-bit Microsoft Windows applications.

Security descriptor | $SECURITY_DESCRIPTOR | This attribute is present for backward compatibility with previous versions of NTFS. The Windows 2000 version of NTFS stores all security descriptors in the $Secure metadata file, sharing descriptors among files and directories that have the same settings. Previous versions of NTFS stored private security descriptor information with each file and directory.

Data | $DATA | The contents of the file. In NTFS, a file has one default unnamed data attribute and can have additional named data attributes; that is, a file can have multiple data streams. A directory has no default data attribute but can have optional named data attributes.

Index root, index allocation, and index bitmap | $INDEX_ROOT, $INDEX_ALLOCATION, $BITMAP | Three attributes used to implement filename allocation and bitmap indexes for large directories (directories only).

Attribute | Attribute Name | Description

Attribute list | $ATTRIBUTE_LIST | A list of the attributes that make up the file and the file reference of the MFT file record in which each attribute is located. This seldom-used attribute is present when a file requires more than one MFT file record.

Object ID | $OBJECT_ID | A 64-byte identifier for a file or directory, with the lowest 16 bytes (128 bits) unique to the volume. The link-tracking service assigns object IDs to shell shortcut and OLE link source files. NTFS provides APIs so that files and directories can be opened with their object ID rather than their filename.

Reparse information | $REPARSE_POINT | This attribute stores a file's reparse point data. NTFS junctions and mount points include this attribute.

Extended attributes | $EA, $EA_INFORMATION | Extended attributes aren't actively used but are provided for backward compatibility with OS/2 applications.

EFS information | $LOGGED_UTILITY_STREAM | EFS stores data in this attribute that's used to manage a file's encryption, such as the encrypted version of the key needed to decrypt the file and a list of users that are authorized to access the file. The word logged is in the attribute's name because changes to this attribute are recorded in the volume log file (described later in this chapter) for recoverability.

Attribute streams

Each file attribute is stored as a separate stream of bytes within a file

The stream is the unit of file operations

create, delete, read and write

Attribute type code

The file attributes in an MFT record are ordered by type codes

Attribute

Type code: value: optional name
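
The ordering rule above can be sketched by modelling each attribute as a (type, optional name, value) triple and sorting by type code. The numeric codes below are the values commonly documented for NTFS and are not given on the slide, so treat them, the helper name, and the sample record as assumptions.

```python
# Type codes as commonly documented for NTFS; assumed here, not from the slide.
TYPE_CODES = {
    "$STANDARD_INFORMATION": 0x10,
    "$FILE_NAME": 0x30,
    "$SECURITY_DESCRIPTOR": 0x50,
    "$DATA": 0x80,
}

def order_attributes(attributes):
    """Sort (type, optional name, value) triples by ascending type code."""
    return sorted(attributes, key=lambda attr: TYPE_CODES[attr[0]])

# Hypothetical record contents, listed out of order on purpose.
record = [
    ("$DATA", None, b"default stream"),
    ("$FILE_NAME", None, "notes.txt"),
    ("$DATA", "summary", b"named stream"),
    ("$STANDARD_INFORMATION", None, {"read_only": False}),
]

for type_name, name, _ in order_attributes(record):
    print(hex(TYPE_CODES[type_name]), type_name, name or "(unnamed)")
```

The two $DATA entries share a type code and differ only in the optional name, which is how a file carries multiple data streams.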

Filenames

NTFS, <=255 characters

Resident and Nonresident Attributes

resident attribute

the value of an attribute is stored directly in the MFT

Example:

Several attributes are defined as always being resident

The standard information

Index root attributes

Resident attribute header and value

Example: filename attribute

Only one disk access is needed

MFT file record for a small directory

If a particular attribute is too large to fit in the MFT file record

Clusters outside the MFT are allocated; each contiguous group is called a run

If the value grows, more runs are allocated

Such attributes are called nonresident attributes

Resident or nonresident

Determined by the file system

Location transparent to the process

Example:

MFT file record for a large file with two data runs

A large directory can also have nonresident attributes

Example: MFT file record for a large directory with a nonresident filename index

Keeping track of the runs

VCN-to-LCN mapping pairs

VCN

LCN

Example:

VCNs & LCNs for a nonresident data attribute

VCN-to-LCN mappings

Indexing

A file directory is simply an index of filenames

Example:

Filename index for a volume's root directory

Index root attribute

For large directories:

Index buffer + B+ tree

The index root attribute contains the first level of the B+ tree

File index entry:

The file reference in the MFT

Time stamp

File size information

The index allocation attribute

For index buffer runs: VCN-to-LCN mappings

Data Compression and Sparse Files

NTFS supports compression

per-file, per-directory, or per-volume

Related functions

GetVolumeInformation

GetCompressedFileSize

DeviceIoControl

Compressing Sparse Data

Sparse data is often large but contains only a small amount of nonzero data relative to its size

Runs of a noncompressed file and the related MFT record

Compression technique

to remove long strings of zeros from the file

NTFS allocates space only for runs that contain nonzero data

So certain ranges of the file's VCNs have no disk allocations

Example

Runs of a compressed file containing sparse data

Holes (no disk allocation): VCNs 16-31 and 64-127

The related MFT record

Read and write operations

For a hole: a read returns zeros; a write allocates clusters and then writes the data

No mapping for VCNs 16-31 and 64-127
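
Reading from a sparse file with unmapped VCN ranges can be sketched as follows. The run list leaves VCNs 16-31 and 64-127 unmapped, as on the slide; the LCN values, the 4 KB cluster size, and the `fake_disk` callback are assumptions made for illustration.

```python
CLUSTER_SIZE = 4096  # assumed cluster size in bytes

# (start VCN, start LCN, cluster count); LCN values are illustrative.
RUNS = [(0, 133, 16), (32, 193, 16), (48, 96, 16), (128, 324, 16)]

def find_lcn(vcn):
    """Return the LCN backing a VCN, or None if the VCN falls in a hole."""
    for start_vcn, start_lcn, count in RUNS:
        if start_vcn <= vcn < start_vcn + count:
            return start_lcn + (vcn - start_vcn)
    return None

def read_cluster(vcn, read_from_disk):
    """Read one cluster of the file; holes read back as zeros."""
    lcn = find_lcn(vcn)
    if lcn is None:
        return bytes(CLUSTER_SIZE)  # no disk access is needed for a hole
    return read_from_disk(lcn)

def fake_disk(lcn):
    """Stand-in for the volume: fills each cluster with a byte derived from its LCN."""
    return bytes([lcn % 256]) * CLUSTER_SIZE

print(find_lcn(33))                                          # 194
print(find_lcn(20))                                          # None
print(read_cluster(20, fake_disk) == bytes(CLUSTER_SIZE))    # True
```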

Compressing Nonsparse Data

Nonsparse data can also be compressed

Individual files or a whole directory

Compression technique

compression units = 16 clusters long

For a unit to be stored compressed

it must save at least 1 cluster of storage
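
The "save at least one cluster" rule can be sketched per compression unit. This is only an illustration: `zlib` stands in for the NTFS compression algorithm, which it is not, and the 4 KB cluster size and function name are assumptions.

```python
import os
import zlib

CLUSTER_SIZE = 4096    # assumed cluster size in bytes
COMPRESSION_UNIT = 16  # clusters per compression unit, as on the slide

def store_unit(data: bytes):
    """Decide how one 16-cluster unit is stored; returns (mode, clusters used)."""
    if len(data) != CLUSTER_SIZE * COMPRESSION_UNIT:
        raise ValueError("expected exactly one compression unit of data")
    packed = zlib.compress(data)                # stand-in compressor
    needed = -(-len(packed) // CLUSTER_SIZE)    # round up to whole clusters
    if needed < COMPRESSION_UNIT:               # saves at least one cluster
        return "compressed", needed
    return "uncompressed", COMPRESSION_UNIT     # no saving: store as-is

unit_bytes = CLUSTER_SIZE * COMPRESSION_UNIT
print(store_unit(b"A" * unit_bytes))        # ('compressed', 1)
print(store_unit(os.urandom(unit_bytes)))   # ('uncompressed', 16)
```

Highly repetitive data shrinks to a single cluster, while random data cannot be reduced, so the unit is written uncompressed.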

Example

Data runs of a compressed file

Actual storage

For a compressed file

Runs must start at a virtual 16-cluster boundary

Read/write unit: the compression unit (why 16 clusters?)

Example: the related MFT record

Reparse Points

$REPARSE_POINT attribute

A block of up to 16 KB of application-defined reparse data and a 32-bit reparse tag

\$Extend\$Reparse metadata file