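USENIX FAST 2010 Trip Report

8th USENIX Conference on File and Storage Technologies (FAST ’10)

Ryousei Takano, National Institute of Advanced Industrial Science and Technology (AIST)

March 31, 2010, Virtualization Implementation Technology Study Group @ University of Tokyo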

USENIX

•  Sponsored conferences and symposia
   –  Annual Technical Conference
   –  FAST: Conference on File and Storage Technologies
   –  LISA: Large Installation System Administration Conference
   –  OSDI: Symposium on Operating Systems Design and Implementation
   –  NSDI: Symposium on Networked Systems Design and Implementation
   –  Security Symposium
•  Plus many workshops and co-sponsored events
   –  Hot{Cloud, Dep, Mobile, OS, Par, Power, Sec, Storage}

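FAST ’10 Overview

•  Dates: February 23-26, 2010
•  Location: San Jose, USA
•  Overview: An international conference on file systems and storage technologies. Topics included techniques for using solid-state storage, storage virtualization, and failure analysis.
•  Paper acceptance rate: 20% (18/89)
•  Attendees: 350+
   –  From Japan: Hitachi, Fujitsu, NEC, Tohoku University, University of Tokyo, NEDO, AIST

Note: Papers, slides, and videos are available at http://www.usenix.org/events/fast10/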

Tutorials

•  Solid-State Storage: Technology, Design, and Application
   Richard Freitas and Larry Chiu, IBM
   –  Will PCM be what comes after flash memory?
      •  One of the ASPLOS 2010 best papers was PCM-related
   –  Insights into SSD performance gained from the Quick Silver project
      •  The "bathtub" performance problem on cheap SSDs
•  Storage and Network Deduplication Technologies
   Michael Condict, NetApp
   –  The difference between data compression and dedup (deduplication)
   –  A comparison of the various dedup approaches
•  Clustered and Parallel Storage System Technologies
   Marc Unangst, Panasas
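The distinction between compression and dedup drawn in the deduplication tutorial above can be illustrated with a small sketch. It is not taken from the tutorial: the fixed block size, the SHA-256 fingerprint, and the function names `compressed_size` and `dedup_size` are assumptions chosen for illustration. Compression removes redundancy inside one byte stream, while dedup stores each distinct block once and refers to it wherever it recurs.

```python
import hashlib
import os
import zlib

BLOCK_SIZE = 4096  # assumed fixed block size, for illustration only


def compressed_size(data: bytes) -> int:
    """Bytes needed after general-purpose compression (zlib)."""
    return len(zlib.compress(data))


def dedup_size(data: bytes, block_size: int = BLOCK_SIZE) -> int:
    """Bytes needed when each distinct fixed-size block is stored once.

    Reference metadata is ignored, so this is an optimistic estimate.
    """
    unique = {}
    for off in range(0, len(data), block_size):
        block = data[off:off + block_size]
        # Identical blocks share a fingerprint and are counted only once.
        unique.setdefault(hashlib.sha256(block).digest(), len(block))
    return sum(unique.values())


if __name__ == "__main__":
    block = os.urandom(BLOCK_SIZE)   # content with no internal redundancy
    data = block * 8                 # the same block written eight times
    print("raw bytes:  ", len(data))
    print("zlib bytes: ", compressed_size(data))
    print("dedup bytes:", dedup_size(data))
```

The two techniques are complementary rather than competing: a system could, for example, dedup blocks first and then compress the unique blocks that remain.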

Keynote Addresses

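•  Technology for Developing Regions
   Eric Brewer, University of California, Berkeley
   –  What IT can contribute to developing countries
      •  WiLDNet: WiFi technology whose performance degrades little as the link distance grows
      •  TierStore: a distributed FS for poor network environments
•  Enterprise Analytics on Demand
   Oliver Ratzesberger, eBay, Inc.
   –  The infrastructure that handles the I/O-intensive workloads of eBay and PayPal
   –  Analytics as a Service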

Best Paper Awards

•  quFiles: The Right File at the Right Time
   Kaushik Veeraraghavan and Jason Flinn, University of Michigan; Edmund B. Nightingale, Microsoft Research, Redmond; Brian Noble, University of Michigan
   –  An abstraction for uniformly handling multiple files that are logically identical but differ in data representation
   –  Case studies: versioning, security (context-aware), Odyssey (application-aware), platform-specific display, etc.
•  Membrane: Operating System Support for Restartable File Systems
   Swaminathan Sundararaman, Sriram Subramanian, Abhishek Rajimwale, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau, and Michael M. Swift, University of Wisconsin-Madison
   –  A mechanism that detects file system failures and restarts the file system
      •  Data checkpointing plus an operation log
   –  Implements micro-reboot in the Linux FS subsystem
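The "checkpoint plus operation log" recovery idea noted for Membrane above can be sketched with a toy in-memory store. This is a generic illustration and not Membrane's implementation; the class `RestartableStore` and all of its methods are hypothetical names.

```python
import copy


class RestartableStore:
    """Toy store that recovers from a failure by replaying logged
    operations on top of the last checkpoint (hypothetical illustration)."""

    def __init__(self):
        self.state = {}        # live state, which a fault may corrupt
        self.checkpoint = {}   # last known-good snapshot
        self.oplog = []        # operations applied since the checkpoint

    def write(self, key, value):
        self.oplog.append(("write", key, value))
        self.state[key] = value

    def delete(self, key):
        self.oplog.append(("delete", key, None))
        self.state.pop(key, None)

    def take_checkpoint(self):
        """Snapshot the current state and start a fresh log."""
        self.checkpoint = copy.deepcopy(self.state)
        self.oplog.clear()

    def restart(self):
        """Discard the live state, restore the checkpoint, replay the log."""
        state = copy.deepcopy(self.checkpoint)
        for op, key, value in self.oplog:
            if op == "write":
                state[key] = value
            else:
                state.pop(key, None)
        self.state = state
```

In this sketch the caller never sees the failure: after `restart()` the store holds exactly the state implied by the checkpoint and the operations logged since then.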

Build a Better File System and the World Will Beat a Path to Your Door

•  [Best Paper] quFiles: The Right File at the Right Time
   Kaushik Veeraraghavan and Jason Flinn, University of Michigan; Edmund B. Nightingale, Microsoft Research, Redmond; Brian Noble, University of Michigan
•  Tracking Back References in a Write-Anywhere File System
   Peter Macko and Margo Seltzer, Harvard University; Keith A. Smith, NetApp, Inc.
   –  Adds back references from blocks to inodes to speed up block reorganization in modern file systems (CoW, writable snapshots)
•  End-to-end Data Integrity for File Systems: A ZFS Case Study
   Yupu Zhang, Abhishek Rajimwale, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau, University of Wisconsin-Madison
   –  Uses fault injection on ZFS to test and analyze what happens when memory is corrupted

Looking for Trouble

•  Black-Box Problem Diagnosis in Parallel File Systems
   Michael P. Kasick, Carnegie Mellon University; Jiaqi Tan, DSO National Labs, Singapore; Rajeev Gandhi and Priya Narasimhan, Carnegie Mellon University
   –  A diagnosis method born from experience with a BlueGene/P PVFS cluster
   –  Identifies I/O nodes with performance problems ("limping-but-alive") by sampling storage and network performance and comparing the samples across nodes
•  A Clean-Slate Look at Disk Scrubbing
   Alina Oprea and Ari Juels, RSA Laboratories
   –  Disk scrubbing: a process for detecting latent sector errors (LSEs) early
   –  Proposes staggering and an adaptive scrub rate, both exploiting the locality of LSEs
   –  See also: the SIGMETRICS ’07 paper by L. N. Bairavasundaram et al.
•  Understanding Latent Sector Errors and How to Protect Against Them
   Bianca Schroeder, Sotirios Damouras, and Phillipa Gill, University of Toronto
   –  Simulation-based evaluation of scrubbing and intra-disk redundancy
   –  LSE trends seen in data provided by NetApp: high locality (20-60% fall within 10 sectors) and concentration at the beginning and end of the disk (over 50%)
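The cross-node comparison described for the black-box diagnosis talk above can be sketched as follows. The paper's actual statistics are not reproduced here: the median-based rule, the `threshold` factor, and the name `find_limping_nodes` are assumptions for illustration only.

```python
from statistics import median


def find_limping_nodes(samples, threshold=2.0):
    """Return the nodes whose mean latency exceeds `threshold` times the
    median of all nodes' mean latencies.

    `samples` maps a node name to a list of sampled latencies (any unit).
    The rule and the default threshold are illustrative assumptions.
    """
    means = {node: sum(vals) / len(vals) for node, vals in samples.items()}
    typical = median(means.values())
    return sorted(node for node, m in means.items() if m > threshold * typical)


if __name__ == "__main__":
    observed = {
        "io0": [10, 11, 9],
        "io1": [10, 10, 10],
        "io2": [40, 42, 38],   # still serving requests, but much slower
        "io3": [9, 10, 11],
    }
    print(find_limping_nodes(observed))
```

Comparing against the peers' median rather than a fixed limit means the sketch needs no prior model of what "normal" performance is, which fits the black-box framing.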

Flash: Savior of the Universe?

•  DFS: A File System for Virtualized Flash Storage
   William K. Josephson and Lars A. Bongo, Princeton University; David Flynn, Fusion-io; Kai Li, Princeton University
   –  A storage manager layer plus a file system, both specialized for flash storage (the Fusion-io ioDrive)
   –  What will future storage interfaces look like?
•  Extending SSD Lifetimes with Disk-Based Write Caches
   Gokul Soundararajan, University of Toronto; Vijayan Prabhakaran, Mahesh Balakrishnan, and Ted Wobber, Microsoft Research Silicon Valley
   –  Griffin: uses an HDD as a cache to extend the life of an SSD
   –  The rationale: an HDD is fast enough as long as the writes are sequential
•  Write Endurance in Flash Drives: Measurements and Analysis
   Simona Boboila and Peter Desnoyers, Northeastern University
   –  Reverse-engineers FTL (Flash Translation Layer) algorithms using a USB analyzer
   –  Causes of I/O latency: seeks on HDDs; GC triggered by a shortage of free blocks on flash
   –  Proposes a scheduling scheme that minimizes the use of free blocks
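One way to see why a disk-based write cache could extend SSD lifetime, as in the Griffin talk above, is to count the writes that actually reach the SSD. The sketch below assumes that overwrites of the same block are absorbed by an append-only HDD log and that only the newest version is migrated later. The class `LogWriteCache` and its explicit `flush` trigger are hypothetical and are not taken from the paper.

```python
class LogWriteCache:
    """Toy model: writes are appended to an HDD log, and only the newest
    version of each block is migrated to the SSD on flush."""

    def __init__(self):
        self.hdd_log = []     # sequential, append-only (block, data) records
        self.ssd = {}         # block number -> data
        self.ssd_writes = 0   # block writes that actually reached the SSD

    def write(self, block, data):
        # Every write is a sequential append, which an HDD handles well.
        self.hdd_log.append((block, data))

    def read(self, block):
        # The newest logged version wins; otherwise fall back to the SSD.
        for logged_block, data in reversed(self.hdd_log):
            if logged_block == block:
                return data
        return self.ssd.get(block)

    def flush(self):
        """Migrate the latest version of each logged block to the SSD."""
        latest = {}
        for block, data in self.hdd_log:
            latest[block] = data   # later records replace earlier ones
        for block, data in latest.items():
            self.ssd[block] = data
            self.ssd_writes += 1
        self.hdd_log.clear()
```

In this model, ten overwrites of one block followed by a flush cost the SSD a single write, so the saving grows with how often blocks are overwritten between migrations.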

I/O, I/O, to Parallel I/O We Go

•  Accelerating Parallel Analysis of Scientific Simulation Data via Zazen
   Tiankai Tu, Charles A. Rendleman, Patrick J. Miller, Federico Sacerdoti, and Ron O. Dror, D.E. Shaw Research; David E. Shaw, D.E. Shaw Research and Columbia University
   –  Zazen: a parallel disk cache system
•  Efficient Object Storage Journaling in a Distributed Parallel File System
   Sarp Oral, Feiyi Wang, David Dillow, Galen Shipman, and Ross Miller, National Center for Computational Sciences at Oak Ridge National Laboratory; Oleg Drokin, Lustre Center of Excellence at Oak Ridge National Laboratory and Sun Microsystems Inc.
   –  Spider: the Lustre-based storage system for Jaguar. Peak throughput is 240 GB/s. The key is asynchronous journal commits
•  Panache: A Parallel File System Cache for Global File Access
   Marc Eshel, Roger Haskin, Dean Hildebrand, Manoj Naik, Frank Schmuck, and Renu Tewari, IBM Almaden Research
   –  Accesses a supercomputer's GPFS across a wide-area network via pNFS (supported in NFS 4.1)

Making Management More Manageable

•  BASIL: Automated IO Load Balancing Across Storage Devices
   Ajay Gulati, Chethan Kumar, and Irfan Ahmad, VMware, Inc.; Karan Kumar, Carnegie Mellon University
   –  Automates virtual disk placement and load balancing, using I/O latency as the primary metric
•  Discovery of Application Workloads from Network File Traces
   Neeraja J. Yadwadkar, Chiranjib Bhattacharyya, and K. Gopinath, Indian Institute of Science; Thirumale Niranjan and Sai Susarla, NetApp Advanced Technology Group
   –  Uses profile HMMs to detect user operations (make, find, tar) from NFS opcode traces
•  Provenance for the Cloud
   Kiran-Kumar Muniswamy-Reddy, Peter Macko, and Margo Seltzer, Harvard School of Engineering and Applied Sciences
   –  Data provenance (history/origin): something like traceability for data?
      •  See also: the co-located workshop TaPP (Theory and Practice of Provenance) ’10
   –  A protocol for realizing DAG-based workflows on cloud storage

Concentration: The Deduplication Game

•  I/O Deduplication: Utilizing Content Similarity to Improve I/O Performance
   Ricardo Koller and Raju Rangaswami, Florida International University
   –  Dedup at the OS block I/O layer rather than on disk
•  HydraFS: A High-Throughput File System for the HYDRAstor Content-Addressable Storage System
   Cristian Ungureanu, NEC Laboratories America; Benjamin Atkin, Google; Akshat Aranya, Salil Gokhale, and Stephen Rago, NEC Laboratories America; Grzegorz Całkowski, VMware; Cezary Dubnicki, 9LivesData, LLC; Aniruddha Bohra, Akamai
   –  Builds a high-performance file system on top of the CAS product HYDRAstor (see FAST ’09). Reportedly it is quite heavily engineered
•  Bimodal Content Defined Chunking for Backup Streams
   Erik Kruus and Cristian Ungureanu, NEC Laboratories America; Cezary Dubnicki, 9LivesData, LLC
   –  Uses CDC to decide dedup chunk sizes
   –  Goal: cut chunks as large as possible without lowering the DER
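Content-defined chunking (CDC), mentioned for the Bimodal CDC talk above, can be sketched as below. This is a basic single-threshold CDC and not the bimodal algorithm from the paper. The window size, the divisor, the use of `zlib.crc32` as the window hash, and the names `cdc_chunks` and `stored_ratio` are all assumptions. The ratio computed here (input bytes over unique chunk bytes) is meant only as a stand-in for a dedup ratio such as the DER.

```python
import hashlib
import zlib

WINDOW = 16     # bytes hashed at each position (assumed value)
DIVISOR = 256   # a cut is declared when hash % DIVISOR == 0 (assumed value)


def cdc_chunks(data: bytes, window: int = WINDOW, divisor: int = DIVISOR):
    """Split data wherever the hash of the trailing `window` bytes hits a
    fixed pattern, so cut points depend on content rather than offsets."""
    chunks, start = [], 0
    for i in range(window - 1, len(data)):
        # Recomputed at every position for clarity; a real system would
        # use a rolling hash that is updated incrementally.
        if zlib.crc32(data[i - window + 1:i + 1]) % divisor == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])   # trailing bytes after the last cut
    return chunks


def stored_ratio(streams):
    """Total input bytes divided by the bytes of unique chunks."""
    unique, total = {}, 0
    for data in streams:
        total += len(data)
        for chunk in cdc_chunks(data):
            unique.setdefault(hashlib.sha256(chunk).digest(), len(chunk))
    return total / sum(unique.values())
```

Because cut points follow the content, inserting a byte at the front of a stream changes only the first chunk or two; with fixed-size blocks, every later block would shift and stop matching. A larger `divisor` yields larger chunks on average, which reduces metadata but tends to find fewer duplicates, which is the tension the last bullet above refers to.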

The Power Button

•  Evaluating Performance and Energy in File System Server Workloads
   Priya Sehgal, Vasily Tarasov, and Erez Zadok, Stony Brook University
   –  Evaluates ops/kJ and ops/sec across combinations of workloads (web, fs, mail, db) and file systems (ext2, ext3, xfs, reiserfs)
   –  In the end, might the single metric ops/sec be sufficient?
•  SRCMap: Energy Proportional Storage Using Dynamic Consolidation
   Akshat Verma, IBM Research, India; Ricardo Koller, Luis Useche, and Raju Rangaswami, Florida International University
   –  Storage virtualization that makes power consumption proportional to load
•  [Best Paper] Membrane: Operating System Support for Restartable File Systems
   Swaminathan Sundararaman, Sriram Subramanian, Abhishek Rajimwale, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau, and Michael M. Swift, University of Wisconsin-Madison
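The two metrics in the first talk of this session are related by simple arithmetic, sketched below. The function names and the example numbers are hypothetical, and the sketch assumes energy is average power multiplied by run time. Under that assumption, ops/kJ equals 1000 × (ops/sec) / watts, so if average power hardly differs between configurations the two metrics rank them the same way. That is one possible reading of the question raised on the slide, not a result taken from the paper.

```python
def ops_per_second(ops: float, seconds: float) -> float:
    """Throughput of a run."""
    return ops / seconds


def ops_per_kilojoule(ops: float, avg_power_watts: float, seconds: float) -> float:
    """Energy efficiency of a run, assuming energy = average power x time."""
    energy_kj = avg_power_watts * seconds / 1000.0
    return ops / energy_kj


if __name__ == "__main__":
    # Hypothetical run: 6000 operations in 60 s at an average of 200 W.
    print(ops_per_second(6000, 60))
    print(ops_per_kilojoule(6000, 200, 60))
```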

Bonus: Intel Museum

Busicom 141-PF (1972)
ALTAIR 8800 (1975)
IBM PC (1981)