The Design and Implementation of a Log-Structured File System
- Mendel Rosenblum and John K. Ousterhout

Embedded System Lab.
서동화 dhdh0113@gmail.com


Contents
  Background
  Problem
  Design of log-structured file system
    File location and reading
    Free space management
    Segment cleaning mechanism & policies
    Crash recovery
      Checkpoints
      Roll-forward
  Result


Background
Disk & file system?

Physical (disk): access time is made up of seek time, rotational latency, and transmission time.

Logical (file system): the structure and logic rules used to manage groups of information and their names are called a “file system”. A file system controls how data is stored and retrieved.


Problem

Over the last decade CPU speeds have increased dramatically while disk access times have only improved slowly.

This trend is likely to continue in the future and it will cause more and more applications to become disk-bound.

Main memory is increasing in size at an exponential rate. Modern file systems cache recently-used file data in main memory.

Larger main memory makes larger file caches possible. Larger caches absorb more read requests, so disk traffic becomes dominated by writes.

To lessen the impact of this problem, the paper presents a new disk storage management technique called a “log-structured file system”.


Problem with existing file systems

Current file systems suffer from two general problems that make it hard for them to cope with the technologies and workloads of the 1990s.

First, they spread information around the disk in a way that causes too many small accesses. Second, they tend to write synchronously: the application must wait for the write to complete instead of continuing while the write is handled in the background.

As a result, these problems make it hard for applications to benefit from faster CPUs and larger memory.
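
To see why many small accesses are costly, the sketch below applies the usual disk cost model (seek time plus rotational latency plus transfer time) to one small and one large access. The function names and the default disk parameters are illustrative assumptions, not figures from the slides.

```python
def access_time_ms(nbytes, seek_ms=10.0, rotation_ms=8.3, bandwidth_mb_s=1.3):
    """Time for one disk access: seek + average rotational latency + transfer.

    All default parameters are assumed values for an older disk.
    """
    rotational_latency_ms = rotation_ms / 2  # half a revolution on average
    transfer_ms = nbytes / (bandwidth_mb_s * 1e6) * 1e3
    return seek_ms + rotational_latency_ms + transfer_ms


def bandwidth_utilization(nbytes, **disk):
    """Fraction of one access that is spent actually transferring data."""
    total = access_time_ms(nbytes, **disk)
    positioning = access_time_ms(0, **disk)  # seek + rotation only
    return (total - positioning) / total


small = bandwidth_utilization(4 * 1024)      # one 4 KB block
large = bandwidth_utilization(1024 * 1024)   # one 1 MB sequential write
print(f"4 KB access: {small:.0%} of the time transfers data")
print(f"1 MB access: {large:.0%} of the time transfers data")
```

Under these assumed parameters the small access spends most of its time positioning the head, which is the inefficiency that writing large sequential runs is meant to avoid.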


Design of log-structured file system
File location and reading

Table: Summary of the major data structures stored on disk by Sprite LFS

Figure: A comparison between Sprite LFS and Unix FFS

Although the two layouts have the same logical structure, the log-structured file system produces a much more compact arrangement. As a result, the write performance of Sprite LFS is much better than that of Unix FFS.


Design of log-structured file system
Free space management

The goal of free space management is to maintain large free extents for writing new data.

Segment cleaning mechanism
The process of copying live data out of a segment is called “segment cleaning”.

1) Read a number of segments into memory.
2) Identify the live data.
3) Write the live data back to a smaller number of clean segments.

Sprite LFS writes a “segment summary block” as part of each segment. The summary block identifies each piece of information that is written in the segment. Sprite LFS also uses the segment summary information to distinguish live blocks from those that have been overwritten or deleted.
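
The three cleaning steps can be sketched as follows. This is a minimal in-memory model, not Sprite LFS code: the `Segment` class, the `(file_id, block_no, version)` summary entries, and the `current_version` lookup table are all hypothetical stand-ins for the segment summary block and the liveness check.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    # Each entry is (file_id, block_no, version). The tuple is a hypothetical
    # stand-in for what a segment summary block records about each block.
    summary: list = field(default_factory=list)


def is_live(entry, current_version):
    """A block is live if it is still the newest version of that file block."""
    file_id, block_no, version = entry
    return current_version.get((file_id, block_no)) == version


def clean(segments, current_version, blocks_per_segment):
    """Copy the live blocks of `segments` into as few new segments as possible."""
    # Steps 1 and 2: walk the segments read into memory, keep only live blocks.
    live = [e for seg in segments for e in seg.summary if is_live(e, current_version)]
    # Step 3: pack the live blocks densely into clean segments.
    return [
        Segment(live[i:i + blocks_per_segment])
        for i in range(0, len(live), blocks_per_segment)
    ]


old = [
    Segment([("a", 0, 1), ("a", 1, 1), ("b", 0, 1), ("b", 1, 1)]),
    Segment([("a", 0, 2), ("c", 0, 1), ("b", 1, 2), ("c", 1, 1)]),
]
# Newest version of every block that still exists (deleted blocks are absent).
current = {("a", 0): 2, ("a", 1): 1, ("b", 1): 2, ("c", 0): 1}
new = clean(old, current, blocks_per_segment=4)
print(len(old), "segments cleaned into", len(new))
```

Overwritten blocks fail the version comparison and deleted blocks are missing from the table, so both are dropped during the copy.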


Design of log-structured file system
Segment cleaning policies

Which segments should be cleaned? How should the live blocks be grouped when they are written out?

Write cost
The write cost is the average amount of time the disk is busy per byte of new data written, including all the cleaning overheads.

$\text{write cost} = \frac{2}{1-u}$
(where $u$ is the utilization of the segments being cleaned and $0 \le u < 1$)

The write cost is very sensitive to utilization: the performance of a log-structured file system can be improved by reducing the overall utilization of the disk space.

This is a trade-off between cost and performance. To get both, the cleaner needs to produce a bimodal segment distribution: most segments nearly full, a few empty or nearly empty.
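
As a worked example of the write cost, the paper's formula is 2 / (1 - u): cleaning reads one whole segment, writes back the live fraction u, and leaves 1 - u of the segment for new data, so (1 + u + (1 - u)) / (1 - u) = 2 / (1 - u). The sketch below evaluates it for a few utilizations; the function name is an assumption, and the u = 0 case follows the paper's note that an empty segment need not be read at all.

```python
def write_cost(u):
    """Write cost for cleaning segments with utilization u (0 <= u < 1)."""
    if not 0.0 <= u < 1.0:
        raise ValueError("utilization must satisfy 0 <= u < 1")
    if u == 0.0:
        return 1.0  # an empty segment is reused without being read
    return 2.0 / (1.0 - u)


for u in (0.2, 0.5, 0.8, 0.9):
    print(f"u = {u:.1f}  write cost = {write_cost(u):.1f}")
```

The cost grows quickly as u approaches 1, which is why the cleaner benefits from working on segments that are nearly empty.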


Design of log-structured file system
Simulation results

Uniform: each file has equal likelihood of being selected in each step. The cleaner uses a greedy policy.

Hot-and-cold: this access pattern models a simple form of locality. The cleaner uses a greedy policy and also sorts the live data by age before writing it out again.

The authors found that hot and cold segments must be treated differently by the cleaner. It is less beneficial to clean a hot segment because the data will likely die quickly and the free space will rapidly re-accumulate. The stability of the data can be estimated by its age.

Free space in a cold segment is more valuable than free space in a hot segment.
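
The two cleaner behaviours used in these simulations can be sketched in a few lines: greedy selection picks the least-utilized segments, and age sorting groups old (cold) blocks together when they are written out. The function names and the dictionary layout are assumptions made for illustration.

```python
def greedy_pick(utilization, n):
    """Greedy policy: clean the n least-utilized segments."""
    return sorted(utilization, key=utilization.get)[:n]


def sort_by_age(live_blocks):
    """Order live blocks oldest first so cold data is written out together."""
    return sorted(live_blocks, key=lambda block: block["age"], reverse=True)


utilization = {"s0": 0.9, "s1": 0.2, "s2": 0.5}
picked = greedy_pick(utilization, 2)

blocks = [{"id": 1, "age": 5}, {"id": 2, "age": 50}, {"id": 3, "age": 1}]
ordered = [block["id"] for block in sort_by_age(blocks)]
print(picked, ordered)
```

Greedy selection looks only at utilization, so it cannot tell a hot segment from a cold one with the same amount of free space.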


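Design of log-structured file system
Cost-benefit policy

Benefit of cleaning a segment: the free space reclaimed, weighted by how long that space is likely to stay free (estimated by the age of the data).
Cost of cleaning a segment: reading the whole segment and writing back its live data.
Choose the segments with the highest ratio of benefit to cost.

$\frac{\text{benefit}}{\text{cost}} = \frac{\text{free space generated} \times \text{age of data}}{\text{cost}} = \frac{(1-u) \times \text{age}}{1+u}$
(where $u$ is the utilization of the segment and $0 \le u < 1$)

Segment usage table
For each segment, the table records the number of live bytes and the most recent modified time of any block in the segment. In order to sort live blocks by age, the segment summary information records the age of the youngest block written to the segment.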
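
The sketch below ranks segments by the paper's benefit-to-cost ratio, (1 - u) * age / (1 + u), where 1 - u is the free space generated and 1 + u is the cost of reading the segment and writing back its live data. The segment records and function names are assumptions; the age units are arbitrary.

```python
def benefit_to_cost(u, age):
    """Ratio of free space generated (weighted by data age) to cleaning cost."""
    return (1.0 - u) * age / (1.0 + u)


def pick_segments(segments, n):
    """Return the n segments with the highest benefit-to-cost ratio."""
    ranked = sorted(
        segments,
        key=lambda seg: benefit_to_cost(seg["u"], seg["age"]),
        reverse=True,
    )
    return ranked[:n]


segments = [
    {"id": "hot", "u": 0.50, "age": 10},    # emptier, but recently modified
    {"id": "cold", "u": 0.75, "age": 100},  # fuller, but stable for a long time
]
best = pick_segments(segments, 1)[0]
print(best["id"])
```

With these assumed numbers the cold segment wins even though it is more utilized, which is the behaviour a purely greedy choice on utilization would not produce.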


Design of log-structured file system
Crash recovery

When a system crash occurs, the last few operations performed on the disk may have left it in an inconsistent state. During reboot the operating system must review these operations in order to correct any inconsistencies. In LFS, the locations of the last disk operations are easy to determine: they are at the end of the log.

Checkpoints
A checkpoint is a position in the log at which all of the file system structures are consistent and complete. To create a checkpoint, LFS first writes out all modified information to the log, then writes a checkpoint region to a special fixed position on disk.

Roll-forward
In order to recover as much information as possible, LFS scans through the log segments that were written after the last checkpoint. During roll-forward, LFS uses the information in segment summary blocks to recover recently-written file data. It also restores consistency between directory entries and inodes.
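
A minimal sketch of checkpoint plus roll-forward, under stated assumptions: the log is a Python list of summary entries, the checkpoint stores a log position and a copy of the inode map (a structure described in the paper but not on this slide), and all field names are hypothetical. Directory and inode consistency repair is left out.

```python
def recover(checkpoint, log):
    """Rebuild the inode map from the last checkpoint plus the log tail."""
    # Start from the consistent state saved in the checkpoint region.
    inode_map = dict(checkpoint["inode_map"])
    # Roll forward: scan only the entries written after the checkpoint.
    for entry in log[checkpoint["position"]:]:
        if entry["kind"] == "inode":
            # A newer inode copy was logged, so point the map at it.
            inode_map[entry["inode_no"]] = entry["addr"]
        # Data blocks are reachable only through an inode, so data entries
        # with no newer inode after them are simply skipped here.
    return inode_map


log = [
    {"kind": "data", "addr": 100},
    {"kind": "inode", "inode_no": 1, "addr": 101},
    # the checkpoint was taken here, at position 2
    {"kind": "data", "addr": 102},
    {"kind": "inode", "inode_no": 1, "addr": 103},
    {"kind": "inode", "inode_no": 2, "addr": 104},
    {"kind": "data", "addr": 105},  # written just before the crash, no inode
]
checkpoint = {"position": 2, "inode_map": {1: 101}}
recovered = recover(checkpoint, log)
print(recovered)
```

Without roll-forward, recovery would stop at the checkpoint state and the updates logged after it would be lost.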


Result
Experience with Sprite LFS

Small file (1 KB)

Cleaning overheads

Large file (100 MB)