Distributed Virtual Storage System, or Software Defined Storage: an introduction
OSS Laboratories Inc., 2014/10/30
Copyright 2014 (C) OSS Laboratories Inc. All Rights Reserved. http://www.ossl.co.jp
TWITTER: http://twitter.com/satoruf
LINKEDIN: http://jp.linkedin.com/in/satorufunai/ja
SLIDESHARE: http://www.slideshare.net/sfunai
FACEBOOK: http://www.facebook.com/satoru.funai


DESCRIPTION

Presentation material for the 6th Cloud Storage Study Group, 2014/10/30: an introduction to the distributed virtual storage system, or Software Defined Storage, "Ceph".

TRANSCRIPT

1. (Title slide) Distributed Virtual Storage System, or Software Defined Storage: Ceph. OSS Laboratories Inc., 2014/10/30.

2. Storage virtualization comes in several layers (terminology per SNIA, the Storage Network Industry Association):
• Disk virtualization: logical block addressing (LBA)
• Block virtualization: LVM/RAID
• File virtualization: VFS
• File system virtualization
• Software Defined Storage, etc.
Representative products: EMC ViPR / ScaleIO, VMware VSAN, DataCore SANsymphony-V, NexentaStor, Cleversafe / Amplidata

3. Points on which distributed storage systems differ:
• Access interfaces: POSIX / FUSE / block / REST
• Write semantics: WORM, transactional locking, leasing
• Replication: read-only vs. read/write replicas; CAP theorem; PAXOS
• Self healing

4. Major distributed file systems:

File system | Developer | License
Amage | |
Ceph | Inktank | LGPL2
ChironFS | [email protected] | GPL3
Cloudian | Cloudian |
CloudStore / Kosmosfs / Quantcastfs | Quantcast | Apache License 2.0
Cosmos | Microsoft | internal
dCache | DESY and others |
FraunhoferFS (FhGFS) | Competence Center for High Performance Computing | FhGFS license
FS-Manager | CDNetworks |
General Parallel File System (GPFS) | IBM |
Gfarm file system | | BSD
GlusterFS | Gluster (acquired by Red Hat) | GPL3
Google File System (GFS) | Google |
Hadoop Distributed File System | ASF, Cloudera, Pivotal, Hortonworks, WANdisco, Intel | Apache License 2.0
IBRIX Fusion | IBRIX |
LeoFS | | Apache License 2.0
Lustre | originally developed by Cluster File Systems, currently supported by Intel (formerly Whamcloud) | GPL

5. Landscape of distributed storage systems (figure; recoverable labels): Amazon S3, Andrew File System (AFS), Microsoft DFS, MooseFS, FraunhoferFS (FhGFS), PVFS/OrangeFS, Ceph, GlusterFS, sheepdog, XtreemFS, etc.

6. Ceph history (this and several later slides are Copyright 2014 by Inktank):
• 2004: project starts at UCSC
• 2006: open sourced
• 2010: mainline Linux kernel
• 2011: OpenStack integration
• MAY 2012: launch of Inktank
• SEPT 2012: production-ready Ceph
• 2012: CloudStack integration
• 2013: Xen integration
• OCT 2013: Inktank Ceph Enterprise launch
• FEB 2014: RHEL-OSP certification
• APR 2014: Inktank acquired by Red Hat

7. Ceph architecture overview (diagram).

8. Ceph characteristics:
• Runs on commodity PC servers with a stock OS (Linux)
• POSIX file access
• REST object access (AWS S3 / Swift compatible)
• WAN-scale replication via RGW
• Synchronous read/write replication assumes LAN-class latency

9. Ceph API stack:
• RADOS (Reliable, Autonomous, Distributed Object Store): Monitors, OSDs, MDS
• librados: native RADOS API with bindings for C, C++, Java, Python, Ruby, PHP
• CephFS: POSIX file system through the VFS, via a kernel client or a FUSE client
• RADOSGW (RADOS Gateway): REST API over HTTP, compatible with AWS S3 and OpenStack Swift
• RBD (RADOS Block Device): Linux kernel client, or QEMU/KVM via librbd
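To make the librados layer just described concrete, here is a minimal sketch using the Python binding (python-rados) that the slide lists. The config path, the pool name "data", and the object name are illustrative assumptions, not details from the deck:

    # Minimal librados sketch (python-rados). Assumes a reachable cluster
    # described by /etc/ceph/ceph.conf and an existing pool named "data".
    import rados

    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    try:
        ioctx = cluster.open_ioctx('data')   # I/O context bound to one pool
        try:
            ioctx.write_full('greeting', b'hello ceph')  # store an object
            print(ioctx.read('greeting'))                # read it back
        finally:
            ioctx.close()
    finally:
        cluster.shutdown()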
10. Calamari: management GUI for Ceph (Inktank).

11. Traditional vs. Ceph infrastructure (figure): hypervisors and VMs on FC/iSCSI SAN with RAID arrays, versus hypervisors and VMs on commodity Linux servers running Ceph over plain TCP/IP.

12. RBD (RADOS Block Device) and QEMU/KVM (figure): a guest VM's disk is served either through the Linux kernel RBD client or through librbd linked into QEMU/KVM on the hypervisor, with an RBD cache on the hypervisor side; the guest OS sees an ordinary block device.

13. Read/write path (figure): writes are journaled to SSD, reads are served from HDD, within the Ceph storage cluster.

14. Erasure coding:
• Erasure coding as used in, e.g., Azure
• Capacity overhead of roughly 20%-40%; with two parity chunks about 140% of logical data, versus 200% for 2x replication
• CPU cost: high for erasure coding, low for replication

15. Replication vs. erasure coding:
• Replication: full copies of stored objects; very high durability; quicker recovery
• Erasure coding: one copy plus parity; cost-effective durability; expensive recovery

16. Multi-site replication (RADOSGW):
• Remote zones are read-only
• AP-style behavior (eventual consistency)

17. Ceph target market (figure): on a capacity/performance map, content stores sit on traditional NAS, virtualization and private cloud on traditional SAN/NAS, and high performance workloads on traditional SAN; Ceph targets XaaS compute clouds (open-source block) and XaaS content stores (open-source NAS/object).

18. Web-scale object storage (figure): many S3/Swift clients in front of a Ceph cluster.

19. Clients accessing the cluster over the native protocol (figure).

20. The Ceph storage cluster (figure).

21. Multi-site deployment (figure): Ceph storage clusters at Site A and Site B.

22. VM block storage (figure): VMs accessing the cluster over the native protocol.

23. Benchmark white paper: http://www.mellanox.com/related-docs/whitepapers/WP_Deploying_Ceph_over_High_Performance_Networks.pdf

24. Benchmark headline numbers: 2,419 MB/sec (8M sequential read), 110k IOPS (4k sequential read).

25. Incremental object size test, one client, 180 OSDs, 1x replicated pool (figure): monitor nodes and OSD nodes on a private network (192.168.50), client node on a public network (172.27.50); throughput labels of 3 GB/s, 1 GB/s, 12 GB/s, and 6 GB/s. Source: http://www.slideshare.net/Inktank_Ceph/06-ceph-day-october-8th-2014-smc?qid=34fdee3f-a686-4738-b0b1-a02032480876v=qf1b=from_search=5

26. The same incremental object size test, one client, 180 OSDs, with a 1x erasure-coded pool (k=4, m=2) (figure; same topology and throughput labels).

27. Ceph + OpenStack (figure): volumes, ephemeral disks, and copy-on-write snapshots backed by one RADOS cluster.

28. OpenStack user survey, 05/2014 (figure): Ceph usage in dev/QA, proof-of-concept, and production environments.

29. Glance with RBD (figure: the Glance server stores and downloads images in RADOS). In /etc/glance/glance-api.conf:

    default_store=rbd
    rbd_store_user=glance
    rbd_store_pool=images

30. Ceph COW clones:
• COW: copy-on-write
• Reads fall through to the parent image; writes allocate new blocks in the clone (figure labels: READ, WRITE)

31. Cinder: volume creation uses Ceph COW clones.

32. Cinder snapshot/backup on Ceph:
• Cinder snapshots map to Ceph RBD snapshots
• A snapshot can serve as the base of a COW clone
• Cinder backup on Ceph also uses Ceph RBD snapshots
• PG …

33. Cinder/Nova with librbd (figure): the Cinder server creates volumes; Nova compute launches VMs through libvirt (QEMU/KVM) and librbd; boot-from-volume starts the VM from a copy-on-write clone of the Glance image.
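The snapshot, protect, and clone sequence behind slides 30-33 can be sketched with the python-rbd binding. The pool and image names below are invented for illustration, and the parent is assumed to be a format-2 RBD image (cloning requires the layering feature):

    # Copy-on-write clone of an RBD image (python-rados + python-rbd).
    # "images", "golden", and "vm-disk-01" are made-up names.
    import rados
    import rbd

    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    ioctx = cluster.open_ioctx('images')
    try:
        parent = rbd.Image(ioctx, 'golden')
        try:
            parent.create_snap('base')    # point-in-time snapshot
            parent.protect_snap('base')   # snapshots must be protected before cloning
        finally:
            parent.close()
        # The clone shares all blocks with the parent: reads fall through to
        # the parent, writes allocate new blocks in the child (cf. slide 30).
        rbd.RBD().clone(ioctx, 'golden', 'base', ioctx, 'vm-disk-01',
                        features=rbd.RBD_FEATURE_LAYERING)
    finally:
        ioctx.close()
        cluster.shutdown()

This is the shape of operation Cinder and Nova perform when a volume or boot disk is cloned from a Glance image.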
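Circling back to the erasure-coding benchmark above (k=4, m=2): a hedged sketch of creating an equivalent profile and pool programmatically, by sending monitor commands through librados. The profile and pool names are made up, and the JSON command schema mirrors the ceph CLI of that era:

    # Create an erasure-code profile (k=4, m=2) and an erasure-coded pool
    # via librados mon_command; "k4m2" and "ecpool" are illustrative names.
    import json
    import rados

    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    try:
        commands = [
            {'prefix': 'osd erasure-code-profile set',
             'name': 'k4m2', 'profile': ['k=4', 'm=2']},
            {'prefix': 'osd pool create', 'pool': 'ecpool',
             'pg_num': 128, 'pool_type': 'erasure',
             'erasure_code_profile': 'k4m2'},
        ]
        for cmd in commands:
            ret, out, errs = cluster.mon_command(json.dumps(cmd), b'')
            if ret != 0:
                raise RuntimeError(errs)
    finally:
        cluster.shutdown()

The two steps correspond to "ceph osd erasure-code-profile set k4m2 k=4 m=2" followed by "ceph osd pool create ecpool 128 erasure k4m2" on the command line.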
34. Swift-compatible access with Keystone (figure): a client queries the Keystone server for a token and then accesses RADOSGW (a RESTful object store) with that token; Keystone grants and revokes access.

35. OpenStack + Ceph network layout with HA Proxy (figure; public and Ceph networks):
• Cinder/Glance access Ceph through librbd (RBD)
• Swift API traffic goes to RADOSGW behind HA Proxy
• 10 Gb networking for the Ceph networks

36. OpenStack + Ceph design points (surviving keywords: OpenStack, MySQL, OS and Ceph, Compute and Ceph, Network).

37. OpenStack additions:
• Juno: enable cloning for rbd-backed ephemeral disks
• Kilo: volume migration from one backend to another; proper snapshotting for Ceph-based ephemeral disks; improved backup in Cinder

38. CoreOS and Ceph:
• CoreOS: a minimal OSS Linux distribution started by Alex Polvi
• Brings the operational style of Google- and Facebook-scale infrastructure to ordinary clusters
• Designed around containers (Docker)
• SDK …
• Docker … DB …
• Runs on AWS/GCP
• Ceph-FUSE provides POSIX access; CoreOS includes Ceph as of release 423.0.0

39. CoreOS + Ceph (figure): CoreOS hosts run Docker containers that use librbd, or mount /dev/rbd via the kernel client. Details: http://qiita.com/satoruf/items/437d634c70bb8e501b69

40. Ceph roadmap as of 2014/10: Giant (2014/11), Hammer (2015/3?), then the I-release.

41. Giant:
• Tree frozen September 9
• 0.85 dev release includes RDMA support groundwork
• Improved SSD performance
• Improvements to the stand-alone civetweb-based RGW frontend
• New "osd blocked-by" command
• 0.86 released 07 Oct (Giant RC)
• Low-level OSD debugging tool
• Locally repairable codes (LRC)
• Librados locking refactor
• MDS and mon improvements

42. RBD:
• Client-side caching (now enabled by default!)
• New option that keeps the cache write-through until a flush is received (rbd_cache_writethrough_until_flush)
• Eucalyptus support: https://mdshaonimran.wordpress.com/2014/09/17/eucalyptus-block-storage-service-with-ceph-rbd/

43. RGW:
• Stand-alone civetweb front end
• Civetweb is an embedded C/C++ web server
• No need for the Apache overhead, dependencies, etc.

44. CephFS:
• Lots of activity! A third of the core team is assigned here, plus a lot of outside commits
• The Inktank / Red Hat team is using CephFS internally on its QA infrastructure
• Sanding off the rough edges: "not supported" vs. "not ready"
• Feedback encouraged

45. CephFS dogfooding:
• Using CephFS for the internal build/test lab
• 80 TB (80 x 1 TB HDDs, 10 hosts)
• Old, crummy hardware with lots of failures
• Linux kernel clients (ceph.ko, bleeding-edge kernels)
• Lots of good lessons: several kernel bugs found, recovery performance issues, lots of painful admin processes identified, several fat fingers and facepalms

46. References:
• OpenStack + Ceph: http://www.slideshare.net/sfunai/openstackceph
• Ceph: http://www.slideshare.net/sfunai/ceph-33123790
• OpenStack + Ceph: http://www.slideshare.net/sfunai/openstackceph-39609805
• WAN-crossing synchronous replication tests of Ceph/GlusterFS/XtreemFS: https://s3-ap-northeast-1.amazonaws.com/cloudconductorwpupdate/whitepaper/%E5%88%86%E6%95%A3%E3%83%95%E3%82%A1%E3%82%A4%E3%83%AB%E3%82%B7%E3%82%B9%E3%83%86%E3%83%A0%E3%81%AEWAN%E8%B6%8A%E3%81%88%E5%90%8C%E6%9C%9F%E3%83%AC%E3%83%97%E3%83%AA%E3%82%B1%E3%83%BC%E3%82%B7%E3%83%A7%E3%83%B3%E3%81%AE%E6%A4%9C%E8%A8%BC.pdf

47. Ceph community in Japan:
• https://groups.google.com/forum/#!forum/ceph-jp
• ceph …

48. Ceph!! Thank you.