Upload
others
View
2
Download
0
Embed Size (px)
Citation preview
ICE3028: Embedded Systems Design, Fall 2018, Jinkyu Jeong ([email protected])
Solid State Storage Technologies
Jinkyu Jeong ([email protected])Computer Systems Laboratory
Sungkyunkwan Universityhttp://csl.skku.edu
ICE3028: Embedded Systems Design, Fall 2018, Jinkyu Jeong ([email protected]) 2
NVMe (1)
• The industry standard interface forhigh-performance NVM storage– NVMe 1.0 specification in 2011 (now 1.3)– Supported by major OSes: Windows, Linux, Solaris, …
• PCIe-based– Low latency: direct connection to CPU– Scalable performance: 1GB/s per lane, up to 32 lanes – No HBA required: reduced power & cost
• Form factors– Add-in-Card, M.2, BGA, etc.
ICE3028: Embedded Systems Design, Fall 2018, Jinkyu Jeong ([email protected]) 3
NVMe (2)
• Deep queue: 64K commands/queue, up to 64K queues
• Streamlined command set: only 13 required commands
• One register write to issue a command (“doorbell”)
• Support for MSI-X and interrupt aggregation
Doorbell
ICE3028: Embedded Systems Design, Fall 2018, Jinkyu Jeong ([email protected]) 4
User-level NVMe Drivers• Ex. Intel SPDK (storage performance development kit)– All I/O operations issued in user-land– Polling or interrupt (signal to user process)
ICE3028: Embedded Systems Design, Fall 2018, Jinkyu Jeong ([email protected]) 5
User-level NVMe Drivers
• NVMeDirect framework– User-level access +
kernel-level access
NVM
eDire
ct Li
brar
y
NVMe Controller
I/O
Ha
ndle
sI/
O
Que
ues
Block Cache
I/O Scheduler
I/O Completion Thread
Handle Handle
Admin Tool
NVMeDirectAPI
User
Kern
elHW
NVM
e Dr
iver
Defa
ult
Que
ues
User
Q
ueue
s
H.-J. Kim, Y.-S. Lee, and J.-S. Kim, “NVMeDirect: A User-space I/O Framework for Application-specific Optimization on NVMe SSDs,” HotStorage, 2016.
ICE3028: Embedded Systems Design, Fall 2018, Jinkyu Jeong ([email protected]) 6
All-Flash Array
• Interfaces– 10Gb/40Gb Ethernet (iSCSI) or
16Gb Fibre Channel or PCIe– SAS or NVMe SSDs
• Functionalities– Volume management– Virtualization support– RAID– Snapshot– Deduplication– Compression, …
ICE3028: Embedded Systems Design, Fall 2018, Jinkyu Jeong ([email protected]) 7
Traditional Block Interface
• SATA/SCSI/SAS– Read (sector #, length)
Write (sector #, length, data)
– No block-level liveness information– No high-level semantics on data– Several “unwritten contracts”
do not hold for SSDs• Sequential accesses are several tens of
times better than random accesses• Distant LBNs lead to longer seek times• Data written is equal to data issued• …
FTL
SSD
Host
Block device driver
File system
Block I/F
NAND FlashFlash I/F
ICE3028: Embedded Systems Design, Fall 2018, Jinkyu Jeong ([email protected]) 8
Extending Block I/F
• TRIM command– “The data in the specified sectors is
no longer needed”– ATA interface standard
(T13 technical committee)– Non-queued command– SATA 3.1 introduces the Queued
TRIM commandFTL
SSD
Host
Block device driver
File system
NAND Flash
Block I/F + SSD-Specific I/F
Flash I/F
ICE3028: Embedded Systems Design, Fall 2018, Jinkyu Jeong ([email protected]) 9
Atomic Write
• Transaction support for multi-block writes– Simplifies file systems and DBMSes
X. Quyang, et al., “Beyond Block I/O: Rethinking Traditional Storage Primitives,” HPCA, 2011.
ICE3028: Embedded Systems Design, Fall 2018, Jinkyu Jeong ([email protected]) 10
Multi-streamed SSD (1)
• Previous write patterns (= current state) matter
ICE3028: Embedded Systems Design, Fall 2018, Jinkyu Jeong ([email protected]) 11
Multi-streamed SSD (2)
• Mapping data with different lifetime to different streams
• Standardized in T10 SCSI/SAS (2015), NVMe 1.3 (2017)
ICE3028: Embedded Systems Design, Fall 2018, Jinkyu Jeong ([email protected]) 12
Multi-streamed SSD (3)
• Cassandra with Multi-streamed SSD
ICE3028: Embedded Systems Design, Fall 2018, Jinkyu Jeong ([email protected]) 13
Multi-streamed SSD (4)
• Cassandra’s normalized updated throughput with5 streams
ICE3028: Embedded Systems Design, Fall 2018, Jinkyu Jeong ([email protected]) 14
OSSD: Object-based SSD (1)
• OSD (Object-based Storage Device)– Virtualizes physical storage as a pool of objects– Offloads space management to storage devices– Standardized as a subset of SCSI command set
Block interface Object interface
ICE3028: Embedded Systems Design, Fall 2018, Jinkyu Jeong ([email protected]) 15
OSSD: Object-based SSD (2)
• OSD storage model
Application
System Call Interface
File System Storage Management
Sector/LBA Interface
Block I/O Manager
Physical Media
File System User Component
Application
System Call Interface
OSD Storage Management
OSD Interface
Block I/O Manager
Physical Media
File System User Component
Host
StorageDevice
Traditional OSD
ICE3028: Embedded Systems Design, Fall 2018, Jinkyu Jeong ([email protected]) 16
OSSD: Object-based SSD (3)
OAQ
OFS
VFS
iSCSI Initiator
iSCSI Target Daemon
OSSD FrameworkOMLFML
FAL
Host
Target
RawSSD
READ/WRITE/ERASE SATA-2
OSD Interface (iSCSI) TCP 1Gbps
46
7
7
OID -ContextHash
Q n
Q 1
Q 0
16:8
Priority Queue
oid = 7
oid = 46
I/OContext
Object I /O instances
W
W
COAQ
AllocationBitmap
FML
OML
Descriptor
Object Attr.Object Data
μ-Tree
(Extents)
Object Data Buffer
Y. S. Lee, S.-H. Kim, J.-S. Kim, J. Lee, C. park, and S. Maeng, “OSSD: A Case for Object-based Solid State Drives,” MSST, 2013.
ICE3028: Embedded Systems Design, Fall 2018, Jinkyu Jeong ([email protected]) 17
OSSD: Object-based SSD (4)
• Simplified host file system– No need for SSD-specific parameter tuning
• More efficient management of flash storage– Block-level liveness– Metadata separation– Object-aware storage management (allocation, dedup, ...)
• Application-aware storage management– Application hints, QoS
• Storage virtualization– Pooling, tiering, caching, backup, replication, etc.
ICE3028: Embedded Systems Design, Fall 2018, Jinkyu Jeong ([email protected]) 18
In-Storage Computing (1)
• Samsung ISC SSD Prototype– Commodity SSD: Samsung PM1725 NVMe with the ISC
feature– PCIe 3.0x4– 800 GB
• Software– C++11– C++STL– G++– Software emulator
ICE3028: Embedded Systems Design, Fall 2018, Jinkyu Jeong ([email protected]) 19
In-Storage Computing (2)
• ISC Application Development Process
ICE3028: Embedded Systems Design, Fall 2018, Jinkyu Jeong ([email protected]) 20
In-Storage Computing (3)
• ISC Dataflow Programming Model
ICE3028: Embedded Systems Design, Fall 2018, Jinkyu Jeong ([email protected]) 21
In-Storage Computing (4)
• Example: Simple Key-Value Store