
Application Acceleration - Flash's Final Frontier


DESCRIPTION

This was a presentation given by Thomas Kejser, EMEA CTO of Fusion-io, during IPExpo 2012 in London. You can see a recording of the presentation here: http://fio.cc/QKjZxK Flash memory solutions are quickly moving from innovative new technology to a crucial building block in today’s data centers. Leading flash memory platforms are optimised to leave disk-era code behind where it belongs. Less disk-driven code means less latency and more performance, which is why companies across Europe are adopting flash as a new high-performance memory tier for their servers. Few would argue that flash is rapidly replacing disk for performance, but flash in the server is only half the battle. The next wave of flash memory innovation will follow as application developers move on from coding apps for disk and start to integrate flash memory optimisation into their software. Big Data applications are among the first to make this transition, with many other software developers in line to come next. In this talk, Fusion-io EMEA CTO Thomas Kejser explores flash as a memory tier and how application optimisation will lead to the next wave of innovation in the flash memory revolution.


Page 1: Application Acceleration - Flash's Final Frontier

APPLICATION OPTIMISATION Flash’s Final Frontier

Page 2: Application Acceleration - Flash's Final Frontier

AGENDA

Where were we? Where are we? Where will we go from here?

October 25, 2012

Page 3: Application Acceleration - Flash's Final Frontier

ONCE UPON A TIME…


CPU

Where data is needed

Where data is stored

Page 4: Application Acceleration - Flash's Final Frontier

DOES TECHNOLOGY ALWAYS ADVANCE?


CPU

Where data is needed

Where data is stored

Page 5: Application Acceleration - Flash's Final Frontier

ENTER: FLASH TECHNOLOGY


CPU


Page 6: Application Acceleration - Flash's Final Frontier

TECHNOLOGY PROGRESSES TO A POINT…

Fusion devices today:
•  1 billion IOPS aggregate
•  Millions of IOPS on a single PCI slot
•  Capacity in the tens of TB per server

Flash will get:
•  Commoditised
•  Cheaper
•  Faster
•  (Somewhat) denser


Page 7: Application Acceleration - Flash's Final Frontier

DESIGN HABITS

Page 8: Application Acceleration - Flash's Final Frontier

STORAGE BECAME…

Highly aggregated
Tunable (by storage experts)
Layered
Complex


Sequential

Random

Page 9: Application Acceleration - Flash's Final Frontier

PROGRAMMERS ADAPTED…

Sync model → Problem: context switching

Async model (queue) → Problem: over-subscription

Page 10: Application Acceleration - Flash's Final Frontier

FILE SYSTEMS ADAPTED…


File System

Kernel Block Layer

Device Driver

Sector Mapping

Metadata, OS

Problems: double work, CPU file system overhead

Page 11: Application Acceleration - Flash's Final Frontier

DATABASES ADAPTED…

Pages A, B, C in RAM, plus two on-media copies:

1.  ACID change (in RAM)
2.  Log write (sequential)
3.  Flush/checkpoint (random)

Problems: double write, defragmentation

Page 12: Application Acceleration - Flash's Final Frontier

SIMPLIFICATION

Page 13: Application Acceleration - Flash's Final Frontier

MAKING USE OF THE NEW MEDIA

Existing I/O paradigm:
•  open()
•  read()
•  write()
•  seek()
•  close()

New atomic extensions:
•  nvm_vectored_write()


Page 14: Application Acceleration - Flash's Final Frontier

EXAMPLE: ATOMIC I/O PRIMITIVES


NVM Translation Layer

iov[0] iov[1] iov[2] iov[3] iov[4]

LBA 7 + range

LBA 24 + range

LBA 42 + range

LBA 68 + range

LBA 24 + range

LBA 7 + range

iov[0] iov[1]

Application issues call to atomic I/O primitives

TRANSACTION ENVELOPES

WRITE ALL BLOCKS ATOMICALLY
TRIM ALL BLOCKS ATOMICALLY
WRITE AND TRIM ATOMICALLY

Page 15: Application Acceleration - Flash's Final Frontier

ATOMIC I/O PRIMITIVES BENCHMARKS (ATOMIC I/O VS NON-ATOMIC I/O)

1U HP blade server with 16 GB RAM, 8 CPU cores - Intel(R) Xeon(R) CPU X5472 @ 3.00GHz with single 1.2 TB ioDrive2 mono

Significantly more functionality with negligible performance cost


Page 16: Application Acceleration - Flash's Final Frontier

DIRECTFS – ELIMINATING DUPLICATE LOGIC


Traditional stack: Linux VFS → ext / btrfs / xfs → kernel block layer → device driver → sector mapping, with metadata kept in the OS

DirectFS stack: Linux VFS → DirectFS → driver primitives, with its own metadata

Page 17: Application Acceleration - Flash's Final Frontier

DIRECTFS WITH ATOMIC WRITES - ACHIEVING RAW DEVICE PERFORMANCE


[Charts: bandwidth (MiB/s) vs. I/O size — block device vs. directFS, at 1 thread and 8 threads]

Filesystem convenience AND atomic writes with the performance of simple writes to raw device

Page 18: Application Acceleration - Flash's Final Frontier

MAKING DATABASES RUN FASTER

Disk-era path: pages A, B, C in RAM → ACID change → log write (sequential) → flush/checkpoint (random) to the on-media copies

Atomic path: pages A, B, C in RAM → ACID change → atomic write

Page 19: Application Acceleration - Flash's Final Frontier

CASE STUDY: PERCONA SERVER (MYSQL)

Percona has added atomics support to Percona Server 5.5

▸  Removes the need for the MySQL double-write buffer

▸  Ensures data integrity in case of system crashes

▸  Writes 50% less, great for flash

▸  Removes complexity from the software stack

▸  Improves both transaction bandwidth and latency

▸  Works through the directFS filesystem or on raw devices


Page 20: Application Acceleration - Flash's Final Frontier

PERCONA SERVER 5.5 TPC-C BENCHMARKS

Benchmarks run by Percona with Atomic I/O and directFS pre-release


[Chart: TPC-C transaction throughput — Percona Server on ext4 vs. on DirectFS with atomic I/O, comparing non-ACID (double-write disabled), ACID with the standard double-write, and ACID with atomic writes replacing double-write]

50% more transactions with the same ACID durability on the same device

Page 21: Application Acceleration - Flash's Final Frontier

DIRECTFS – BENEFITS IN ELIMINATING DUPLICATE LOGIC


File system    Lines of code
directFS        6,879
ReiserFS       19,996
ext4           25,837
btrfs          51,925
XFS            63,230

Page 22: Application Acceleration - Flash's Final Frontier

ATOMICS AND DIRECTFS BENEFITS


▸  9x reduction in source code: direct access to the underlying media provides a simplified code base with file system semantics

▸  2x flash media life: eliminating write-ahead logging increases the life span of the media

▸  +50% transaction throughput: simplifying the database write path and using atomic storage primitives directly translates to increased throughput.

Page 23: Application Acceleration - Flash's Final Frontier

REVOLUTION

Page 24: Application Acceleration - Flash's Final Frontier

DISK OR MEMORY?

[Diagram: four cores, each with private L1 and L2 caches, sharing an L3]

Latency scale: 1 ns → 10 ns → 100 ns → 10 µs → 100 µs → 10 ms

Page 25: Application Acceleration - Flash's Final Frontier

THE COMING SHIFT

As an SSD, flash accelerates applications.

As direct-access Non-Volatile Memory, flash transforms software development.


Page 26: Application Acceleration - Flash's Final Frontier

HOW?

Where the industry is headed:


Developers allocate 10/100/(1000?) TBs of Non-Volatile Memory, and never do explicit I/O again.

Page 27: Application Acceleration - Flash's Final Frontier

WHY? ELIMINATING THE MISMATCH

Manipulating data structures in memory is native to software development and fast.


Converting in-memory data structures to block I/O for persistence is foreign and expensive.

… but in-memory data has had no persistence.

Page 28: Application Acceleration - Flash's Final Frontier

NOT JUST A BLOCK DEVICE ANYMORE…

Existing I/O paradigm:
•  open(), read(), write(), seek(), close()

New atomic extensions:
•  nvm_vectored_write()

Key-value store extensions:
•  nvm_kv_open()
•  kv_put()
•  kv_get()
•  kv_batch_*()


Page 29: Application Acceleration - Flash's Final Frontier

EXAMPLE: KEY-VALUE STORE API LIBRARY


Key: 1–128 B; value: 64 B–1 MB

kv_put(), kv_get() or kv_batch_get()

•  Key hashed into a sparse address space to simplify collision management
•  Key expiration timer marks the KV pair for VSL garbage collection
•  Value returned through a single I/O operation, regardless of value size
•  kv_get_current(), kv_next(): iterate through each KV pair in a pool of related keys
•  Atomic transaction envelope

Application issues calls to the Key-Value Store API; the NVM translation layer stores the key-value pairs, grouped into pools.

Page 30: Application Acceleration - Flash's Final Frontier

KEY-VALUE STORE API LIBRARY BENCHMARKS (NATIVE KV GET/PUT VS. RAW READS/WRITES)

[Charts: GETs/s and PUTs/s vs. thread count for 512 B, 4 KB, 16 KB, and 64 KB values; and OPS/s vs. thread count comparing 512 B-key KV GET/PUT against 1 KB raw ioDrive reads/writes]

1U HP blade server with 16 GB RAM, 8 CPU cores - Intel(R) Xeon(R) CPU X5472 @ 3.00GHz with single 1.2 TB ioDrive2 mono

Significantly more functionality with negligible performance cost


Page 31: Application Acceleration - Flash's Final Frontier

KEY-VALUE STORE API LIBRARY BENCHMARKS: (VS MEMCACHEDB)


Page 32: Application Acceleration - Flash's Final Frontier

KEY-VALUE STORE API LIBRARY BENEFITS


▸  95% of raw device performance: smarter media now natively understands a key-value I/O interface with lock-free updates, crash recovery, and no additional metadata overhead.

▸  Up to 3x capacity increase: dramatically reduces over-provisioning with coordinated garbage collection and automated key expiry.

▸  3x throughput on the same SSD: early benchmarks against memcached with BerkeleyDB persistence show up to 3x improvement.

Page 33: Application Acceleration - Flash's Final Frontier

OS SWAP VS. EXTENDED MEMORY

OS swap:
▸  Originally designed as a last resort to prevent OOM (out-of-memory) failures
▸  Never tuned for high-performance demand paging
▸  Never tuned for multi-threaded apps
▸  Poor performance, e.g. < 30 MB/s throughput

Extended memory:
▸  No application code changes required
▸  Designed to migrate hot pages to DRAM and cold pages to ioMemory
▸  Tuned to run natively on flash (leverages native characteristics)
▸  Tuned for multi-threaded apps
▸  10–15x throughput improvement over standard OS swap


Non-Volatile Storage (Disks, SSDs, etc.)

System Memory

OS SWAP Mechanism

NV Memory (volatile usage)

System Memory

Extended Memory Mechanism

Page 34: Application Acceleration - Flash's Final Frontier

CHECKPOINTED MEMORY PERSISTENCE PATH


1.  The application designates a virtual address range to be checkpointed.
    a.  This creates an independently addressable linked clone of the checkpointed address range (no data moves or copies).
    b.  The checkpoint appears as an addressable file in the directFS native filesystem namespace.

2.  The application can continue manipulating the designated virtual address range without affecting the contents of the persisted checkpoint file.

3.  The application can load or manipulate the persisted checkpoint file at a later time.

System Memory

NV Memory

Checkpointed Memory

Page 35: Application Acceleration - Flash's Final Frontier

API SPECS POSTED AT DEVELOPER.FUSIONIO.COM

Early access to ioMemory SDK API specs and technical documentation (limited enrollment during the early-access phase): http://developer.fusionio.com

▸  Write less code to create high-performing apps

▸  Tap into performance not available with conventional I/O access to SSDs

▸  Reduce operating costs by decreasing RAM while increasing NVM


Direct access to NVM is for developers whose software retrieves and stores data.

Page 36: Application Acceleration - Flash's Final Frontier

OPEN INTERFACES AND OPEN SOURCE

▸  NVM Primitives: Open Interface

▸  directFS: Open Source, POSIX Interface

▸  NVM API Libraries: Open Source, Open Interface

▸  INCITS SCSI (T10) active standards proposals:

•  SBC-4 SPC-5 Atomic-Write http://www.t10.org/cgi-bin/ac.pl?t=d&f=11-229r6.pdf

•  SBC-4 SPC-5 Scattered writes, optionally atomic http://www.t10.org/cgi-bin/ac.pl?t=d&f=12-086r3.pdf

•  SBC-4 SPC-5 Gathered reads, optionally atomic http://www.t10.org/cgi-bin/ac.pl?t=d&f=12-087r3.pdf

▸  SNIA NVM-Programming TWG active member


Page 37: Application Acceleration - Flash's Final Frontier

FLASH MEMORY EVOLUTION

Five stages, from least to most native NVM access:

1.  Traditional SSDs: application → OS block I/O → file system → block layer → SAS/SATA, RAID controller → read/write through a flash translation layer
2.  ioMemory with conventional I/O: application → OS block I/O → file system → block layer → VSL (expanded flash translation layer) → read/write
3.  ioMemory as transparent cache: application → OS block I/O → file system → block layer → directCache + VSL (with remote storage over the network) → read/write
4.  ioMemory with direct-access I/O: application → user-defined I/O API libraries → direct-access I/O API libraries → directFS (NVM filesystem) → I/O primitives → VSL → read/write
5.  ioMemory with memory semantics: application → user-defined memory API libraries → memory semantics API libraries → directFS (NVM filesystem) → memory primitives → VSL → CPU load/store

Page 38: Application Acceleration - Flash's Final Frontier

CATALYST FOR TOP INDUSTRY PLAYERS TO ACCELERATE PURSUIT OF NVM PROGRAMMING


Page 39: Application Acceleration - Flash's Final Frontier

… AND RESONATING THROUGH THE INDUSTRY


Page 40: Application Acceleration - Flash's Final Frontier

THANK YOU!
