001 hbase introduction

HBase Introduction
Scott Miao

2012/06/25

Agenda
• Course Credit
• One common web site story
• Why RDB not affordable?
• Big Data
• Why use noSQL?
• HBase Introduction
• Hands-on
• noSQL architecture common practices
• Case study

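The story of a web site (1/3)
• An RDBMS is the natural choice for the persistence tier
  • It handles transactions (ACID), enforces integrity constraints, speaks standard SQL, and even offers stored procedures
• The first time your user base outgrows the system…
  • Use an AP server cluster in which all servers share one DB server
• The second time your user base outgrows the system…
  • Split the DB server into a Master-Slave architecture
    • Read data from the Slave servers
    • Write data to the Master server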

HBase: The Definitive Guide - http://www.amazon.com/HBase-Definitive-Guide-Lars-George/dp/1449396100/ref=sr_1_1?ie=UTF8&qid=1339060175&sr=8-1



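The story of a web site (2/3)
• The third time your user base outgrows the system…
  • For the read bottleneck
    • Add a cache such as Memcached (an in-memory store) between the server application and the DB
    • But the server application's cache and the DB can easily become inconsistent
  • For the write bottleneck
    • Upgrade the DB server hardware (CPU, memory, disk, etc.; vertical scaling)
    • Don't forget: the Slave servers need the same hardware upgrade too…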
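The cache inconsistency mentioned on this slide can be sketched with two plain dictionaries. This is only an illustration, not how Memcached itself is used: `CachedStore` and its `invalidate` flag are hypothetical names, and both the "DB" and the "cache" live in one process.

```python
class CachedStore:
    """Cache-aside reads in front of a 'database' (both are plain dicts here)."""

    def __init__(self):
        self.db = {}      # stand-in for the RDBMS
        self.cache = {}   # stand-in for Memcached

    def read(self, key):
        # Serve from the cache when possible; otherwise load from the DB.
        if key in self.cache:
            return self.cache[key]
        value = self.db.get(key)
        if value is not None:
            self.cache[key] = value
        return value

    def write(self, key, value, invalidate=True):
        # Writes always go to the DB; dropping the cached copy is optional
        # here so that the stale-read case can be demonstrated.
        self.db[key] = value
        if invalidate:
            self.cache.pop(key, None)


store = CachedStore()
store.write("user:1", "Alice")
first = store.read("user:1")                      # cache miss, loads from DB
store.write("user:1", "Bob", invalidate=False)    # cache is not told
stale = store.read("user:1")                      # still the old cached value
store.write("user:1", "Carol")                    # invalidating write
fresh = store.read("user:1")                      # reloaded from DB
print(first, stale, fresh)
```

Whenever a write path forgets to invalidate the cache, readers keep seeing the old value until the entry is evicted, which is the inconsistency the slide warns about.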

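The story of a web site (3/3)
• The fourth time your user base outgrows the system…
  • Use database sharding
    • Move from vertical scaling to horizontal scaling
  • The management nightmare begins
    • An RDBMS is inherently ill-suited to distributed storage (ACID, fixed schema)
    • The DBA has to define a set of sharding rules
    • When one DB server goes down or runs out of storage, resharding has to be done by hand
    • Resharding means revising the sharding rules and then copying and migrating data with heavy I/O, while either keeping the site in service or finishing within a fixed downtime window
  • This is usually a last resort adopted after the fact, and one of the few options available
    • Who knew my site would get this popular?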
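To see why resharding is so I/O-heavy, here is a minimal sketch that assumes the simplest possible sharding rule, hash(key) mod number-of-servers. The rule, the key format and the `shard_for` helper are illustrative assumptions, not something the slides prescribe.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a row key to a shard with a stable hash (hypothetical sharding rule)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


keys = [f"user:{i}" for i in range(10_000)]

# Shard assignment with 4 DB servers, then after adding a fifth one.
before = {k: shard_for(k, 4) for k in keys}
after = {k: shard_for(k, 5) for k in keys}

moved = sum(1 for k in keys if before[k] != after[k])
moved_ratio = moved / len(keys)
print(f"{moved_ratio:.0%} of rows must be copied to a different server")
```

Under this rule a row stays put only when its hash gives the same remainder for 4 and for 5, so roughly four rows out of five have to migrate when a single server is added. Smarter rules (range-based or consistent hashing) move less data, but someone still has to design and operate them.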

Why RDB not affordable? (1/6)
• Bottleneck of relational DBs
  • The 90s vs. recent years (Web 2.0)
• Memcached + MySQL
  • Mitigates read stress effectively, but not write stress
• MySQL cluster solutions
  • Master/Slave
    • Not affordable for highly concurrent scenarios
  • Vertical partitioning
  • Vertical/horizontal partitioning (database sharding)
    • Complex
    • Hard to scale out and to change requirements
    • Low availability
• Some kinds of simple but very large data cause this condition

http://www.infoq.com/cn/news/2011/01/nosql-why

Why RDB not affordable ? (2/6) – A general HA system architecture design

軟體專案的素質之四 ─ 整體設計之架構設計案例 ─ http://takeshi-experience.blogspot.tw/2012/04/blog-post.html


Why RDB not affordable ? (3/6) – Master/Slave


Why RDB not affordable ? (4/6) – Vertical Partitioning


Why RDB not affordable ? (5/6) – Master/Slave + Vertical Partitioning


Why RDB not affordable ? (6/6) – Vertical/Horizontal Partitioning


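The era of big data!
• More data was generated in the past 3 years than in the previous 40,000 years!
• Walmart holds 167 times as much data as the US Library of Congress!
• eBay's analytics platform processes up to 100 PB of data per day! (about 100,000,000 GB)
• As of 2010, the world's stored electronic data totaled 1.2 ZB! (1,200,000 PB)
• IDC forecasts that by 2020 the world's stored electronic data will grow to 44 times the 2009 level, reaching 35 trillion GB!
  • 35,000,000,000,000 gigabytes

架构师 10 月刊 ─ http://www.infoq.com/cn/minibooks/architect-oct-10-2011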

Trend Micro’s problem
• Each user visits about 20 ~ 60 HTML pages per day
• Each HTML page contains about 15 ~ 30 URIs
• Each URI object is about 10 ~ 150 KB
• For one million users (taking the lower bounds)
  • 1 million × 20 = 20 million HTML pages
  • 20 million HTML pages × 15 = 300 million URIs
  • 300 million URI objects × 10 KB = 3,000,000,000 KB (about 3 TB)
• And that is the data volume for Taiwan alone
• Trend Micro is a global company
  • So the daily data volume is on the order of tens of TB

趨勢的雲端發現之旅 ─ http://findbook.tw/book/9789866126185/basic
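The estimate above is easy to re-run for both ends of the quoted ranges. The sketch below assumes decimal units (1 TB = 10^9 KB); the function name `daily_volume` is made up for illustration.

```python
USERS = 1_000_000
KB_PER_TB = 10**9  # decimal units: 1 TB = 1,000,000,000 KB


def daily_volume(pages_per_user, uris_per_page, kb_per_uri, users=USERS):
    """Return (HTML pages, URI objects, total size in KB) for one day."""
    pages = users * pages_per_user
    uris = pages * uris_per_page
    return pages, uris, uris * kb_per_uri


low_pages, low_uris, low_kb = daily_volume(20, 15, 10)      # lower bounds
high_pages, high_uris, high_kb = daily_volume(60, 30, 150)  # upper bounds

print(f"low : {low_pages:,} pages, {low_uris:,} URIs, {low_kb / KB_PER_TB:g} TB")
print(f"high: {high_pages:,} pages, {high_uris:,} URIs, {high_kb / KB_PER_TB:g} TB")
```

The lower bounds reproduce the 3 TB per day on the slide; the upper bounds of the same ranges come to 270 TB per day for the same one million users.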

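The rising star of the big data era ─ noSQL
• Not only SQL
• Started in 2009
• Characteristics
  • Does not use the relational data model
  • Distributed storage by design
  • Easy to scale horizontally
  • Open source
  • Easy to extend
  • Simple API operations (CRUD, usually no SQL support)
  • CAP (as opposed to ACID)
    • (Eventual) Consistency, Availability, Partition tolerance
  • Stores huge volumes of heterogeneous data

http://nosql-database.org/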

Why use noSQL?
• Easy to scale out
  • Unlike an RDB, there are no relationships, so it is easy to scale out
• High performance even with big data
  • Table-level cache (RDB) vs. record-level cache (noSQL)
• Elastic data model
  • Schema vs. schema-less/dynamic schema
• High availability
  • Easy to add new machines (nodes) without any performance impact

Comparison between RDB and noSQL

Aspects              RDB                         noSQL
Performance*         Getting lower               Sustained, as with a small data set
Scalability          Mainly for scale up         Mainly for scale out
Reliability          ACID                        CAP
Availability         Hard to maintain SLA        Easy to maintain SLA
Security             Robust                      Depends
Economics            High-end machines           Commodity machines
Data Model           Relational, fixed schema    Depends, but more likely simple and schema-less
Maturity             Very mature                 Not mature, various products
Commercial support   Global companies            Small start-ups
OLAP/BI              Mature                      Immature
Human resource       Easy to find                Hard to find

* If given a really huge amount of data…

noSQL basic categories

iTcloud 新雲端時代 ─ http://www.ithome.com.tw/002/cloud/cloud.html


Apache HBase Introduction
• A top-level ASF project
• Belongs to the Key-Value family of noSQL DBs (more precisely, a column-family store)
• Derived from Google's paper
  • Bigtable: A Distributed Storage System for Structured Data
    • a distributed storage system for managing structured data that is designed to scale to a very large size: petabytes of data across thousands of commodity servers
    • a sparse, distributed, persistent multi-dimensional sorted map

HBase: The Definitive Guide - http://www.amazon.com/HBase-Definitive-Guide-Lars-George/dp/1449396100/ref=sr_1_1?ie=UTF8&qid=1339060175&sr=8-1

http://dl.acm.org/citation.cfm?id=1365816



Apache HBase Concepts – Column-Oriented (1/2)

http://ofps.oreilly.com/titles/9781449396107/intro.html

Apache HBase Concepts – Column-Oriented (2/2)

• a sparse, distributed, persistent multi-dimensional sorted map
  • which is indexed by row key, column key (column family + qualifier), and a timestamp

Column Families
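The phrase "multi-dimensional sorted map" may be easier to picture as nested dictionaries. The toy model below is only a sketch of the indexing scheme (row key, then column key, then timestamp); `TinyTable` is a hypothetical class, it is neither distributed nor persistent, and it says nothing about how HBase stores data internally.

```python
from collections import defaultdict


class TinyTable:
    """Toy model: {row key: {"family:qualifier": {timestamp: value}}}."""

    def __init__(self):
        self._rows = defaultdict(lambda: defaultdict(dict))
        self._clock = 0  # logical timestamp; a real system would use wall-clock time

    def put(self, row, column, value):
        # Every put adds a new version instead of overwriting the old one.
        self._clock += 1
        self._rows[row][column][self._clock] = value

    def get(self, row, column):
        # Sparse: a missing cell simply does not exist and costs nothing.
        versions = self._rows.get(row, {}).get(column, {})
        return versions[max(versions)] if versions else None

    def scan(self):
        # Sorted: rows come back in row-key order, columns in column-key order.
        for row in sorted(self._rows):
            for column in sorted(self._rows[row]):
                versions = self._rows[row][column]
                ts = max(versions)
                yield row, column, ts, versions[ts]


t = TinyTable()
t.put("r2", "f1:c1", "x")
t.put("r1", "f1:c1", "old")
t.put("r1", "f1:c1", "new")   # an update is just another put with a newer timestamp
t.put("r1", "f2:c9", "y")     # r2 has no f2:c9 cell at all
rows = list(t.scan())
for r in rows:
    print(r)
```

Note that the scan returns `r1` before `r2` even though `r2` was written first, and that the update to `r1` / `f1:c1` shows up as the latest of two stored versions.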

Apache HBase Concepts - Architecture

http://ofps.oreilly.com/titles/9781449396107/architecture.html

Hands-on (1/3) – Use your VM (Virtual Machine) to install tm-puppet
• Please refer to the SPN Dev HBase training program again
• Install git on your PC
• Install tm-puppet on your VM

Hands-on (2/3) – Use HBase shell
• Basic operations
  • help, list, scan
• Create
  • A table 'MY_FIRST_TABLE'
  • Two column families 'FAM_1', 'FAM_2'
  • Ex.
    • create 't1', {NAME => 'f1'}, {NAME => 'f2'}
    • create 't1', 'f1', 'f2' (shorthand for the line above)
• Put two records (columns)
  • Ex. put 't1', 'r1', 'c1', 'value'
  • In a real table the column is written as 'family:qualifier', e.g. 'f1:c1'
• Update a record (column) (this is also a put)
• Delete a record (column)
  • delete 't1', 'r1', 'c1'

Hands-on (3/3) – Requirements
• Put a screenshot of your successfully installed tm-puppet into git
  • Run the following commands
    • jps
    • ifconfig
  • Capture the screen
  • Path: ${git_home}/hbase-training/001/hands-on/${your_name}/hands-on-001.jpg
• Put a screenshot of your HBase shell records into git
  • Run the following commands
    • scan 'MY_FIRST_TABLE'
    • ifconfig
  • Capture the screen
  • Path: ${git_home}/hbase-training/001/hands-on/${your_name}/hands-on-002.jpg
• Commit and push to git

noSQL architecture practices (1/8) – Use noSQL as complement
• Use noSQL as a mirror (implemented by code)
  • The RDB remains the primary store, with noSQL as its mirror

NoSQL 架構實踐（一）— 以 NoSQL為輔 ─ http://www.infoq.com/cn/news/2011/02/nosql-architecture-practice

noSQL architecture practices (2/8) – Use noSQL as complement

//PSEUDO CODE for noSQL as a mirror
//We want to store the data Object
bool status = false;
DB.startTransaction();              //start transaction
id = DB.Insert(data);               //write data Object to RDB
if (id > 0) {
    status = NoSQL.Add(id, data);   //write data Object to noSQL by id
}
if (id > 0 && status == true) {
    DB.commit();                    //commit transaction
} else {
    DB.rollback();                  //failed, rollback transaction
}
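For readers who want to run the mirror pattern, here is a minimal Python sketch. It uses the standard-library `sqlite3` module as the RDB and a dictionary-backed `DictStore` as a stand-in for a noSQL client; `DictStore`, `save_mirrored` and the `items` table are all hypothetical names.

```python
import sqlite3


class DictStore:
    """Stand-in for a noSQL client; a real one would talk to a server."""

    def __init__(self, fail=False):
        self.data = {}
        self.fail = fail  # lets the example simulate a failed noSQL write

    def add(self, key, value):
        if self.fail:
            return False
        self.data[key] = value
        return True


def save_mirrored(conn, nosql, title):
    """Insert into the RDB, mirror to noSQL, and commit only if both succeed."""
    cur = conn.cursor()
    try:
        cur.execute("INSERT INTO items (title) VALUES (?)", (title,))
        row_id = cur.lastrowid
        if not nosql.add(row_id, {"title": title}):
            raise RuntimeError("noSQL write failed")
        conn.commit()
        return row_id
    except Exception:
        conn.rollback()  # undo the RDB insert so the two stores stay in step
        return None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, title TEXT)")
conn.commit()

good_store = DictStore()
ok_id = save_mirrored(conn, good_store, "hello")
bad_id = save_mirrored(conn, DictStore(fail=True), "lost")
row_count = conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
print(ok_id, bad_id, row_count)
```

As in the pseudo code, the RDB transaction is the arbiter: a failed mirror write rolls the insert back. The sketch does not cover the opposite case, where the noSQL write succeeds and the commit then fails.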

• Use noSQL as a mirror (implemented by synchronization)



• Combine RDB & noSQL




//PSEUDO CODE for RDB & noSQL combination
//We want to store the data Object
data.title = "title";
data.name = "name";
data.time = "2009-12-01 10:10:01";
data.from = "1";
bool status = false;
DB.startTransaction();              //start transaction
//write into RDB, data.from is a value needed by search criteria
id = DB.Insert("INSERT INTO table (from) VALUES(data.from)");
if (id > 0) {
    //write data Object to noSQL by id
    status = NoSQL.Add(id, data);
}
if (id > 0 && status == true) {
    DB.commit();                    //commit transaction
} else {
    DB.rollback();                  //failed, rollback transaction
}
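A runnable sketch of the combination pattern follows, again with `sqlite3` as the RDB and a plain dictionary standing in for the noSQL store. The table `posts`, the column `source` (used instead of `from`, which is a reserved word in SQL) and the helpers `save` and `find_by_source` are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The RDB keeps only the id and the value needed by search criteria.
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, source TEXT)")
conn.commit()

documents = {}  # stand-in for the noSQL store: id -> full data object


def save(data):
    """Store the searchable field in the RDB and the whole object in noSQL."""
    cur = conn.cursor()
    row_id = None
    try:
        cur.execute("INSERT INTO posts (source) VALUES (?)", (data["from"],))
        row_id = cur.lastrowid
        documents[row_id] = dict(data)
        conn.commit()
        return row_id
    except Exception:
        conn.rollback()
        documents.pop(row_id, None)  # drop the mirrored copy as well
        raise


def find_by_source(source):
    """Search in the RDB, then fetch the full objects from noSQL by id."""
    query = "SELECT id FROM posts WHERE source = ? ORDER BY id"
    ids = [row[0] for row in conn.execute(query, (source,))]
    return [documents[i] for i in ids]


save({"title": "t1", "name": "n1", "time": "2009-12-01 10:10:01", "from": "1"})
save({"title": "t2", "name": "n2", "time": "2009-12-01 10:11:00", "from": "2"})
save({"title": "t3", "name": "n3", "time": "2009-12-01 10:12:00", "from": "1"})

found = find_by_source("1")
print([doc["title"] for doc in found])
```

The RDB row stays small (an id plus one indexed value), while the bulky fields live only in the noSQL store.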

• What benefits can we get from the RDB & noSQL combination practice?
  • Less data is stored in the RDB, which reduces its I/O and saves storage space
  • Raises the RDB table-level cache hit rate: only updates to key values (PK, FK, values used in search criteria) refresh the cache
  • Improves synchronization efficiency in an RDB Master/Slave architecture
  • Improves RDB backup/recovery efficiency
  • Improves the scalability/performance of the whole system

noSQL architecture practices (7/8) – Use noSQL as master
• Use noSQL only
  • Mainly for systems with simple query requirements
  • But some noSQL products can handle more complex queries
    • MongoDB, Tokyo Cabinet, etc.

NoSQL 架構實踐（二）— 以 NoSQL為主 ─ http://www.infoq.com/cn/news/2011/03/nosql-architecture-practice-2

noSQL architecture practices (8/8) – Use noSQL as master
• Use noSQL as the major data source
  • APs write data only into noSQL
  • The data is then synchronized from noSQL to other data stores according to each application's needs
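One way to picture "write to noSQL, then synchronize elsewhere" is a primary store that keeps an ordered change log, which secondary stores replay at their own pace. The sketch below is an assumption about how such a sync could look; `PrimaryStore`, `Follower` and the key prefixes are invented for illustration.

```python
class PrimaryStore:
    """The noSQL master: every write is applied and appended to a change log."""

    def __init__(self):
        self.data = {}
        self.log = []  # ordered list of (key, value) changes

    def put(self, key, value):
        self.data[key] = value
        self.log.append((key, value))


class Follower:
    """A secondary store (search index, RDB, cache...) fed from the log."""

    def __init__(self, wants):
        self.wants = wants  # predicate: which keys this store cares about
        self.data = {}
        self.offset = 0     # position in the primary's log already applied

    def sync(self, primary):
        for key, value in primary.log[self.offset:]:
            if self.wants(key):
                self.data[key] = value
        self.offset = len(primary.log)


primary = PrimaryStore()
user_index = Follower(lambda key: key.startswith("user:"))

primary.put("user:1", "Alice")
primary.put("order:7", "3 books")
user_index.sync(primary)

primary.put("user:1", "Alice B.")
before_second_sync = dict(user_index.data)  # follower lags until it syncs again
user_index.sync(primary)
print(before_second_sync, user_index.data)
```

Applications write only to `primary`; each follower picks the subset it needs and may lag behind between syncs, which is the trade-off of this architecture.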

Case Study (1/4) – Facebook’s Real-time Message System

• Uses HBase to store 135+ billion messages a month
• Beat out a few competitors such as Cassandra, sharded MySQL, etc.
• Data patterns
  • A short set of temporal data that tends to be volatile
  • An ever-growing set of data that rarely gets accessed

Facebook's New Real-time Messaging System: HBase to Store 135+ Billion Messages a Month - http://highscalability.com/blog/2010/11/16/facebooks-new-real-time-messaging-system-hbase-to-store-135.html



• Some key aspects of their system:
  • HBase
    • Has a simpler consistency model than Cassandra.
    • Very good scalability and performance for their data patterns.
    • Most feature rich for their requirements: auto load balancing and failover, compression support, multiple shards per server, etc.
    • HDFS, the filesystem used by HBase, supports replication, end-to-end checksums, and automatic rebalancing.
    • Facebook's operational teams have a lot of experience using HDFS because Facebook is a big user of Hadoop and Hadoop uses HDFS as its distributed file system.

• Haystack is used to store attachments.
• A custom application server was written from scratch in order to service the massive inflows of messages from many different sources.
• A user discovery service was written on top of ZooKeeper.
• Infrastructure services are accessed for: email account verification, friend relationships, privacy decisions, and delivery decisions.
• In keeping with their small-teams-doing-amazing-things approach, 20 new infrastructure services are being released by 15 engineers in one year.
• Facebook is not going to standardize on a single database platform; they will use separate platforms for separate tasks.

Case Study (4/4) – Alibaba China Site architecture

http://www.infoq.com/cn/presentations/hl-alibaba-cn-architecture-design-practice


Data Access pattern as the key for noSQL
• Data Structure
  • Structured
  • Semi-structured
  • Unstructured
• Size
  • How many & how often writes/reads (proportion)
• Data Writing
  • Transaction
• Data Reading
  • Random access
  • Sequential access
  • Relationship

Q & A
