28
Copyright 2009 ITRI 工工工工工工工 2012/8/11 OpenStack APAC Conference 2012 1 1 Distributed Block-level Storage Management for OpenStack OpenStack APAC Conference 2012 Daniel Lee CCMA/ITRI Cloud Computing Center for Mobile Applications Industrial Technology Research Institute ( 工工工工工工工工工工工工 )

Danile lee -open stackblocklevelstorage

Embed Size (px)

Citation preview

Page 1: Danile lee -open stackblocklevelstorage

Copyright 2009 ITRI 工業技術研究院

2012/8/11 OpenStack APAC Conference 2012

1 1

Distributed Block-level Storage Management for OpenStack

OpenStack APAC Conference 2012

Daniel Lee

CCMA/ITRICloud Computing Center for Mobile Applications

Industrial Technology Research Institute(雲端運算行動運用科技中心 )

Page 2: Danile lee -open stackblocklevelstorage

Copyright 2009 ITRI 工業技術研究院

2012/8/11 OpenStack APAC Conference 2012

2

Outline

• Overview• Brief on ITRI Cloud OS• Cloud storage system in ITRI Cloud OS• Distributed Main Storage System• Distributed Secondary Storage System• Integration of Cloud OS Storage System with

OpenStack• Summary• Q&A

Page 3: Danile lee -open stackblocklevelstorage

Copyright 2009 ITRI 工業技術研究院

2012/8/11 OpenStack APAC Conference 2012

3

Overview

• Data is growing exponentially and moving on to Could. It needs to be available at all time, secured and protected.

• 911 in Twin Towers, 311 earthquake in Japan and 420 outage of Amazon datacenters are disasters.– Block-level and fully redundant– Backup / restore + wide-area– Thin provisioning– De-duplication– Others

• Cloud OS Storage System has these functionality and is ready to integrate to OpenStack

Page 4: Danile lee -open stackblocklevelstorage

Copyright 2009 ITRI 工業技術研究院

2012/8/11 OpenStack APAC Conference 2012

4

ITRI Cloud OSITRI Cloud OS(An All-in-one IaaS Total Solution)(An All-in-one IaaS Total Solution)

Physical ResourceManagement(HP/Dell)

Physical ResourceManagement(HP/Dell)

Inter-PM Load Balancingand VM Fail-over (VMWare)

Inter-PM Load Balancingand VM Fail-over (VMWare)

Virtual Data Center Provisioning (VMWare)

Virtual Data Center Provisioning (VMWare)

Power Management(IBM)

Power Management(IBM)

Inter-VM LoadBalancing (F5)

Inter-VM LoadBalancing (F5)

Security (Checkpoint)

Security (Checkpoint)

Virtual Data Center ManagementVirtual Data Center Management

Physical Data CenterManagement (IBM)

Physical Data CenterManagement (IBM)

Primary/SecondaryStorage Management(EMC)

*Blue name in bracket indicates the leading company of the component

*Blue name in bracket indicates the leading company of the component

Page 5: Danile lee -open stackblocklevelstorage

Copyright 2009 ITRI 工業技術研究院

2012/8/11 OpenStack APAC Conference 2012

5

ITRI Cloud OSITRI Cloud OSService InterfaceService Interface

Photo SharingVDC

Provision and Deploy

Monitor and ConfigureVirtual Resources

Video StreamingVDC

Web ConferenceVDC

Physical Cluster

Virtual Data Center ManagementVirtual Data Center ManagementVirtual Data Center ManagementVirtual Data Center Management Physical Data Center ManagementPhysical Data Center ManagementPhysical Data Center ManagementPhysical Data Center Management

Multiplexing VDC’s in a Physical DCMultiplexing VDC’s in a Physical DC Multiplexing VDC’s in a Physical DCMultiplexing VDC’s in a Physical DC

•Cloud Application Developer•Cloud Service Provider

•Cloud Service

Infrastructure Administrator•Carrier

Monitor, Diagnose and ConfigurePhysical Resources

Page 6: Danile lee -open stackblocklevelstorage

Copyright 2009 ITRI 工業技術研究院

2012/8/11 OpenStack APAC Conference 2012

6

VDCM – Assets (VDC, VC, VM)

Page 7: Danile lee -open stackblocklevelstorage

Copyright 2009 ITRI 工業技術研究院

2012/8/11 OpenStack APAC Conference 2012

7

VDCM – System Images

Page 8: Danile lee -open stackblocklevelstorage

Copyright 2009 ITRI 工業技術研究院

2012/8/11 OpenStack APAC Conference 2012

8

VDCM - Volumes

Page 9: Danile lee -open stackblocklevelstorage

Copyright 2009 ITRI 工業技術研究院

2012/8/11 OpenStack APAC Conference 2012

9

Cloud Storage System in Cloud OS

• Cloud storage aims at cloud-scale data centers, and is designed to be scalable, available and low-cost.

• Main components– Distributed Main Storage subsystem – DMS

• Provide current image I/O request processing– Distributed Secondary Storage subsystem – DSS

• Block-level incremental metadata only backup system• Take volume snapshot / restore, de-duplication engine,

garbage collection and WADB engine

Page 10: Danile lee -open stackblocklevelstorage

Copyright 2009 ITRI 工業技術研究院

2012/8/11 OpenStack APAC Conference 2012

10

Key Features of Cloud Storage System from Users’ Perspective

• Use like raw, unformatted block devices • Support volume size from 1 GB to 2 TB (64TB architecture-wise)• Can be attached by different instances (one at a time for now)• Multiple volumes can be mounted to the same instance• Data is automatically replicated on different physical storage

server (N=3)• Ability to clone volumes and/or create point-in-time snapshots of

volumes• Wide-area backup for disaster recovery• Snapshot can be scheduled by assigning a backup policy with time,

frequency, retention, and WADB option• Save system image for starting multiple VMs using the same image

Page 11: Danile lee -open stackblocklevelstorage

Copyright 2009 ITRI 工業技術研究院

2012/8/11 OpenStack APAC Conference 2012

11

Key Features of Cloud Storage System from Providers’ Perspective

• Use of commodity hardware – reducing cost (JBOD)

• Disk space management for up to multiple petabytes

• Add disk, remove disk without interruption• Thin provisioning• Enables you to provision a specific level of I/O

performance if desired, by setting I/O throttle• Block-level de-duplication for reducing space

requirement

Page 12: Danile lee -open stackblocklevelstorage

Copyright 2009 ITRI 工業技術研究院

2012/8/11 OpenStack APAC Conference 2012

12

DMS in Cloud OS

• DMS - Distributed Main Storage• Goal: A cloud scale network storage solution that provide

high capacity, high reliability, high availability and high performance

• Features:– Volume operations: Create / Attach / Detach / Delete– High Reliability: data replication (N=3)– High Availability: No single point of failure– High Performance: client-side metadata/data cache for performance– Thin provisioning is utilized to optimize utilization of available storage.– Fast Volume Cloning– Save Image; FastBoot– Disk I/O Throttle

Page 13: Danile lee -open stackblocklevelstorage

Copyright 2009 ITRI 工業技術研究院

2012/8/11 OpenStack APAC Conference 2012

13

Compute Node

DMS In-kernel client

VM Volume

Service Node

Storage Node

VM Volume

VM Volume

NameNode

DataNode

Metadata

Data

DMS System Architecture

DMS three components:1. Namenode2. DMS in-kernel client3. Datanode

RR

Page 14: Danile lee -open stackblocklevelstorage

Copyright 2009 ITRI 工業技術研究院

2012/8/11 OpenStack APAC Conference 2012

14

In-Kernel Client

Block Devices

.....

Domain 0 Kernel Space

IOCTL

Domain 0 User Space

Domain U Kernel Space

Compute Node

Applications

Read/Write Sectors /dev/hda1

/dev/dms01

/dev/dms02

Block FrontendDriver

Domain U User Space

Xen / KVM I/O Virtualization

DataNodes

NameNode

Attach a disk to Dom U

User-space Daemon

RPM

Attach/Detach

Volume

Metadata

Payload

Block BackendDriver

DMS Compute Node Architecture

Page 15: Danile lee -open stackblocklevelstorage

Copyright 2008 ITRI 工業技術研究院 2012/8/11

OpenStack APAC Conference 201215

Compute Node

DMS In-kernel Client

VM Volume Service Node

Storage Node

NameNode

DataNode

2. query metadata: (VolID, LBID)

3. HBID, N-locations

4. Read Data

7. Report failed locations

Read Flow

Storage Node

DataNode

Storage Node

DataNode

6. Response (payload) 1. Read request

5. Read Data from another datanode if previous-try failed

Page 16: Danile lee -open stackblocklevelstorage

Copyright 2008 ITRI 工業技術研究院 2012/8/11

OpenStack APAC Conference 201216

Compute Node

DMS In-kernel Client

VM Volume Service Node

Storage Node

NameNode

DataNode

2. Request for new space:(VolID, LBID)

3. HBID, N-locations

4. Write Data

7. Commit (Success, or report failed locations)

Write Flow – Memory-Ack Approach

Storage Node

DataNode

Storage Node

DataNode

5. Memory Ack

6. Disk Ack

6. Response 1. Write request

• Memory-ack approach: In client, I/O response to VM(step6) is right after receiving memory ack from Datanodes (step5) .

Page 17: Danile lee -open stackblocklevelstorage

Copyright 2009 ITRI 工業技術研究院

2012/8/11 OpenStack APAC Conference 2012

17

DSS in Cloud OS

• Share the homogeneous space with DMS• Block-level Incremental backup

– Copy-on-write (COW) and volume cloning techniques– Backup is just a snapshot of meta-data

• Support volume-level restore• Block-level de-duplication• Garbage collection of expired, un-used and redundant disk

blocks• Wide-area data backup / restore and near-realtime

replication• The secondary storage itself is failure-tolerant (HA)

Page 18: Danile lee -open stackblocklevelstorage

Copyright 2009 ITRI 工業技術研究院

2012/8/11 OpenStack APAC Conference 2012

18

DSS – Backup Policy

Page 19: Danile lee -open stackblocklevelstorage

Copyright 2009 ITRI 工業技術研究院

2012/8/11 OpenStack APAC Conference 2012

19

DSS – Assign Backup Policy

Page 20: Danile lee -open stackblocklevelstorage

Copyright 2009 ITRI 工業技術研究院

2012/8/11 OpenStack APAC Conference 2012

20

DSS – WADB

Backing up the data to a remote location at the time volume snapshot is taken…

Network / VPN

Network / VPN

BeijingBeijing Shang-HaiShang-Hai

snapshotsnapshot

restorerestore

Page 21: Danile lee -open stackblocklevelstorage

Copyright 2009 ITRI 工業技術研究院

2012/8/11 OpenStack APAC Conference 2012

21

DSS – Enabling WADB

Page 22: Danile lee -open stackblocklevelstorage

Copyright 2009 ITRI 工業技術研究院

De-duplication in DropBox

22

2012/8/11 OpenStack APAC Conference 2012

Page 23: Danile lee -open stackblocklevelstorage

Copyright 2009 ITRI 工業技術研究院

2012/8/11 OpenStack APAC Conference 2012

23

DSS Software Architecture DiagramBackup/RestoreRequest

Backup Policy

SOAP server

Snapshot

Garbage Collection

Thread Manager / Flow Control

DSS Daemon

Tread

DSS main

Classes

RPC

DMS client protocol

DMS RS

RS client Lib

DB manager

De-dupScheduler

De-dup Engine

Restore

Timer Fired

DSS Manager

Quartz Scheduler

WADB

Page 24: Danile lee -open stackblocklevelstorage

Copyright 2009 ITRI 工業技術研究院

2012/8/11 OpenStack APAC Conference 2012

24

OpenStack

Compute Storage Network

Identity Security

Web Interface

Object(Swift) Block FileHypervisors Architectures

(ARM, GPU)Switch Router Firewall

Page 25: Danile lee -open stackblocklevelstorage

Copyright 2009 ITRI 工業技術研究院

2012/8/11 OpenStack APAC Conference 2012

25

Integration of Cloud OS Storage System with OpenStack

User portalUser portal

SECS

VDCM PDCM

VMM

VM1 VM2 VMn

SLB

DMS

… …volume

RS

PRML2

LOGUser PDCM User

Glance

Nova ComputeNova Volume

Nova Manager

Nova API

DB

Nova Network

Web Services(OpenStack API enabled)

Page 26: Danile lee -open stackblocklevelstorage

Copyright 2009 ITRI 工業技術研究院

Summary of AdvantagesFeature BenefitsUnlimited storage highly scalable read/write access

Leverages commodity hardware (JBOD) No lock-in, lower price/GB

Built-in replication Fully redundant (default: N=3)

Used as system volume Save Image, Fast Boot, etc.

Thin Provisioning Optimize utilization of available storage

Highly available (HA) storage servers (DMS/DSS)

Server failover for high available

Snapshot and backup for block volumes Data protection and recovery for VM data

Policy based backup Free of worry – set it and forget about it

Wide-area data backup For disaster recovering

Block-level data de-duplication Store more data in the limited space

Standalone volume API, CLI available Easy integration with other compute systems

2012/8/11 OpenStack APAC Conference 2012

26

Page 27: Danile lee -open stackblocklevelstorage

Copyright 2009 ITRI 工業技術研究院

More Feature in V2.0Feature BenefitsVolume Cloning Fast cloning without copying data

Share virtual disk within cluster Support sharing of volume and cluster file systemDisk bandwidth QoS guarantee Dynamic adjustment at I/O request levelStorage space optimization N-way replication for write-intensive blocks

N-parity-group for read-intensive blocks

Wide-area data replication Near real-time wide-area replication

Integration with Compute / Volume Fully integrated to Compute for attaching block volumes and reporting on usage

2012/8/11 OpenStack APAC Conference 2012

27

Page 28: Danile lee -open stackblocklevelstorage

Copyright 2009 ITRI 工業技術研究院

Q&A

Thank You !