SATA, SAS, SSD, CAM, GEOM, ... The Block Storage Subsystem in FreeBSD Alexander Motin <[email protected]> iXsystems, Inc. EuroBSDCon 2013


SATA, SAS, SSD, CAM, GEOM, ... The Block Storage Subsystem in FreeBSD

Alexander Motin <[email protected]>, iXsystems, Inc.

EuroBSDCon 2013


«A long time ago» … in our own galaxy … appeared block storage ...

FreeBSD 3: struct cdevsw

FreeBSD 4: struct cdevsw + early disk(9) KPI

FreeBSD 5: disk(9) KPI + GEOM


Block storage above disk(9)

● Data operations:
– Read
– Write
● Properties
– Block size
– Capacity


Block storage KPI

● Data operations → start(struct bio *)
– Read → BIO_READ
– Write → BIO_WRITE
● Properties
– Block size → sectorsize
– Capacity → mediasize


Removable block storage

● Media lock/notify → access(), spoiled()
● Data operations → start(struct bio *)
– Read → BIO_READ
– Write → BIO_WRITE
● Properties
– Block size → sectorsize
– Capacity → mediasize


Write-caching block storage

● Media lock/notify → access(), spoiled()
● Data operations → start(struct bio *)
– Read → BIO_READ
– Write → BIO_WRITE
– Cache flush → BIO_FLUSH
● Properties
– Block size → sectorsize
– Capacity → mediasize


Thin-provisioned block storage

● Media lock/notify → access(), spoiled()
● Data operations → start(struct bio *)
– Read → BIO_READ
– Write → BIO_WRITE
– Cache flush → BIO_FLUSH
– Unmap / Trim → BIO_DELETE
● Properties
– Block size → sectorsize
– Capacity → mediasize


Additional attributes

● Media lock/notify → access(), spoiled()
● Data operations → start(struct bio *)
– Read → BIO_READ
– Write → BIO_WRITE
– Cache flush → BIO_FLUSH
– Unmap / Trim → BIO_DELETE
● Properties
– Block size → sectorsize
– Capacity → mediasize
– C/H/S, physical sector size, serial number, ... → stripesize, stripeoffset, BIO_GETATTR
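
The single-entry-point idea on these slides, where one start(struct bio *) routine receives every data operation and two properties describe the media, can be sketched in userland C. This is only an illustration under stated assumptions: `struct sketch_bio`, `struct sketch_disk`, `sketch_start()` and the `SK_*` commands are hypothetical stand-ins, not the kernel's definitions, and treating a delete as zeroing a RAM-backed medium is an assumption made so the effect is observable.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for BIO_READ, BIO_WRITE, BIO_FLUSH, BIO_DELETE. */
enum sketch_cmd { SK_READ, SK_WRITE, SK_FLUSH, SK_DELETE };

/* Hypothetical, heavily reduced request descriptor. */
struct sketch_bio {
    enum sketch_cmd cmd;
    uint64_t offset;   /* byte offset on the media */
    uint64_t length;   /* byte count */
    void *data;        /* buffer for reads and writes */
    int error;         /* 0 on success, errno value otherwise */
};

/* Hypothetical disk: the two basic properties plus RAM-backed media. */
struct sketch_disk {
    uint32_t sectorsize;
    uint64_t mediasize;
    uint8_t *media;    /* mediasize bytes */
    unsigned flushes;  /* number of cache flushes seen */
};

/* One entry point receives every command and dispatches on its type. */
static void sketch_start(struct sketch_disk *dp, struct sketch_bio *bp)
{
    assert(dp != NULL && bp != NULL);
    bp->error = 0;

    if (bp->cmd == SK_FLUSH) {
        dp->flushes++;  /* nothing is cached here, just count it */
        return;
    }

    /* Requests must be sector-aligned and lie inside the media. */
    if (bp->offset % dp->sectorsize != 0 ||
        bp->length % dp->sectorsize != 0 ||
        bp->offset > dp->mediasize ||
        bp->length > dp->mediasize - bp->offset) {
        bp->error = EINVAL;
        return;
    }

    switch (bp->cmd) {
    case SK_READ:
        memcpy(bp->data, dp->media + bp->offset, bp->length);
        break;
    case SK_WRITE:
        memcpy(dp->media + bp->offset, bp->data, bp->length);
        break;
    case SK_DELETE:
        /* Assumption: model unmap/trim as zeroing the range. */
        memset(dp->media + bp->offset, 0, bp->length);
        break;
    default:
        bp->error = EINVAL;
        break;
    }
}
```

A caller fills in a `struct sketch_bio`, passes it to `sketch_start()`, and checks `error` afterwards. Adding a new operation extends the command set rather than the function table, which mirrors how the slides add cache flush and unmap without adding entry points.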


From one layer to many – GEOM

[Diagram: three stacked layers, each labelled «Block storage KPI»]


GEOM topology

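[Diagram: GEOM topology, bottom to top; a Geom attaches its Consumer to the Provider of the Geom below]
● ATA HDD
● Geom DISK «ada0», Provider ada0
● Geom DEV «ada0», Consumer of ada0 → /dev/ada0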


Mounted UFS in GEOM

[Diagram: GEOM topology, bottom to top]
● ATA HDD
● DISK «ada0», provider ada0
● DEV «ada0» → /dev/ada0
● VFS «ada0» → /mnt/...


Disk partitioning in GEOM

[Diagram: GEOM topology, bottom to top]
● ATA HDD
● DISK «ada0», provider ada0
● DEV «ada0» → /dev/ada0
● PART «ada0», providers ada0s1 and ada0s2
● DEV «ada0s1» → /dev/ada0s1
● DEV «ada0s2» → /dev/ada0s2


Cascaded disk partitioning

[Diagram: GEOM topology, bottom to top]
● DISK «ada0», provider ada0
● DEV «ada0»
● PART «ada0», providers ada0s1 and ada0s2
● DEV «ada0s2»
● PART «ada0s1», providers ada0s1a and ada0s1b
● DEV «ada0s1a», DEV «ada0s1b»
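
One consequence of cascading partition layers is that a request addressed to an inner provider has to be shifted by each enclosing partition's start before it reaches the disk. The sketch below shows that arithmetic in userland C. It is an assumption-laden illustration, not GEOM code: `struct pt_provider` and `pt_translate()` are hypothetical names, and the partition offsets used in any example are invented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a provider nested inside a parent provider. */
struct pt_provider {
    const char *name;
    const struct pt_provider *parent;  /* NULL for the whole disk */
    uint64_t start;                    /* byte offset within the parent */
    uint64_t size;                     /* size in bytes */
};

/*
 * Translate a request on provider pp into a byte offset on the
 * underlying disk by walking up through every enclosing layer.
 * Returns 0 on success, -1 if the request leaves any layer's bounds.
 */
static int pt_translate(const struct pt_provider *pp, uint64_t offset,
    uint64_t length, uint64_t *disk_offset)
{
    assert(disk_offset != NULL);
    for (; pp != NULL; pp = pp->parent) {
        /* The request must fit inside the current layer. */
        if (offset > pp->size || length > pp->size - offset)
            return -1;
        /* Shift into the parent's address space. */
        offset += pp->start;
    }
    *disk_offset = offset;
    return 0;
}
```

With a hypothetical layout where ada0s1 starts at byte 32256 of ada0 and ada0s1a starts at byte 8192 of ada0s1, a request at offset 4096 of ada0s1a lands at 4096 + 8192 + 32256 = 44544 on the disk.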


GEOM functionality

● Tasting
● Orphanization
● Spoiling
● Configuration
● I/O processing


GEOM in threads

● Tasting → g_event
● Orphanization → g_event
● Spoiling → g_event
● Configuration → g_event
● I/O submission → g_down
● I/O completion → g_up


GEOM calls and threads

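[Diagram: calls crossing struct cdevsw (Application side) and struct disk (Disk side), with the GEOM threads g_event, g_down and g_up in between]
● Open/close: d_open()/d_close() → g_access() → d_open()/d_close()
● Submission: d_strategy() → g_io_request() → g_down → d_strategy()
● Completion: biodone() → g_io_deliver() → g_up → biodone()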
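
The two hand-offs in this diagram, a request queued for g_down on the way to the disk and its completion queued for g_up on the way back, can be sketched as follows. This is a single-threaded model with the two GEOM threads replaced by functions called in turn; every `sk_*` name is hypothetical, the queue is a fixed-size ring chosen for brevity, and the stand-in disk completes each request immediately.

```c
#include <assert.h>
#include <stddef.h>

#define SK_QLEN 8

/* Hypothetical request: an id and a completion flag. */
struct sk_bio { int id; int done; };

/* Hypothetical fixed-size FIFO of requests. */
struct sk_queue { struct sk_bio *items[SK_QLEN]; size_t head, count; };

static int sk_enqueue(struct sk_queue *q, struct sk_bio *bp)
{
    if (q->count == SK_QLEN)
        return -1;                       /* queue full */
    q->items[(q->head + q->count) % SK_QLEN] = bp;
    q->count++;
    return 0;
}

static struct sk_bio *sk_dequeue(struct sk_queue *q)
{
    struct sk_bio *bp;

    if (q->count == 0)
        return NULL;
    bp = q->items[q->head];
    q->head = (q->head + 1) % SK_QLEN;
    q->count--;
    return bp;
}

struct sk_geom_queues { struct sk_queue down, up; };

/* Submission (role of g_io_request()): the caller only queues the request. */
static int sk_io_request(struct sk_geom_queues *g, struct sk_bio *bp)
{
    assert(g != NULL && bp != NULL);
    return sk_enqueue(&g->down, bp);
}

/*
 * One pass of the "g_down" worker: take queued requests and hand them to
 * the disk. The stand-in disk finishes at once, so the completion is
 * queued for "g_up" right away (role of g_io_deliver()).
 */
static void sk_down_pass(struct sk_geom_queues *g)
{
    struct sk_bio *bp;

    while (g->up.count < SK_QLEN && (bp = sk_dequeue(&g->down)) != NULL)
        (void)sk_enqueue(&g->up, bp);
}

/* One pass of the "g_up" worker: run completions, return how many. */
static int sk_up_pass(struct sk_geom_queues *g)
{
    struct sk_bio *bp;
    int n = 0;

    while ((bp = sk_dequeue(&g->up)) != NULL) {
        bp->done = 1;
        n++;
    }
    return n;
}
```

Even in this reduced form, a request is not complete until both worker passes have run, so each request crosses two queues and two execution contexts between the application and the disk.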


Block storages below disk(9)

● SCSI disks/CD/DVD
● ATA/ATAPI disks/CD/DVD
● MMC/SD cards
● NAND flash
● Proprietary block devices:
– nvme(4)/nvd(4)
– mfi(4)
– aac(4)
– ...


ATA/SCSI block devices before 9.0

ATA – ata(4)
● ad: disk(9) → ATA
● afd: disk(9) → SCSI
● acd: disk(9) → SCSI
● atapicam: wrapper
● ATA bus
● ATA command queue
● ATA HBA drivers

SCSI – CAM
● da: disk(9) → SCSI
● cd: disk(9) → SCSI
● SPI bus
● SCSI command queue
● SCSI HBA drivers


ATA/SCSI block devices after 9.0

CAM handling both ATA and SCSI
● ada: disk(9) → ATA
● da: disk(9) → SCSI
● cd: disk(9) → SCSI
● Virtualized bus: ATA, SATA, SPI, SAS, ...
● Unified ATA/SCSI command queue
● Unified ATA/SCSI HBA drivers


Unified diversity

[Diagram: devices attached through two different HBAs]
● LSI SAS HBA: SES in LSI SAS Expander, 4 Intel SATA SSDs
● Marvell AHCI SATA HBA: Silicon Image Port Multiplier, SES in SATA backplane (via PMP I2C), 4 Intel SATA SSDs


Back to a wider view

[Diagram: GEOM on top of the disk(9) KPI, with Disk 1, Disk 2, Disk 3 and Disk 4 below]


Disk multipath

● 2+ SAS HBAs + dual-expander JBOD + SAS disks;
● 2+ FC HBAs + storage with several FC ports;
● iSCSI initiator and target with 2+ NICs each;
● ...

=
● Improved reliability
● Improved performance

[Diagram: a Host with two HBAs, both connected to the Storage]


Disk multipath in GEOM

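[Diagram: GEOM topology, bottom to top]
● SAS HDD
● DISK «da0», provider da0; DISK «da1», provider da1
● DEV «da0» → /dev/da0; DEV «da1» → /dev/da1
● MULTIPATH «disk0», consumer of da0 and da1, provider multipath/disk0
● DEV «multipath/disk0» → /dev/multipath/disk0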
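
The two benefits claimed for multipath, reliability and performance, come from being able to send a request down any path that still works. The sketch below shows one possible selection policy, round-robin over live paths with failover. It is an assumption for illustration only and does not describe the policy the MULTIPATH class actually implements; `struct mp_path`, `struct mp_set` and `mp_pick()` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical state of one path to the same disk (for example da0, da1). */
struct mp_path { const char *name; bool alive; };

/* Hypothetical multipath device: a set of paths and a rotation cursor. */
struct mp_set { struct mp_path *paths; size_t npaths; size_t next; };

/*
 * Pick the path for the next request: rotate over paths that are alive,
 * skipping failed ones. Returns the path index, or -1 if none is usable.
 */
static int mp_pick(struct mp_set *mp)
{
    size_t i, idx;

    assert(mp != NULL);
    for (i = 0; i < mp->npaths; i++) {
        idx = (mp->next + i) % mp->npaths;
        if (mp->paths[idx].alive) {
            mp->next = (idx + 1) % mp->npaths;  /* spread the load */
            return (int)idx;
        }
    }
    return -1;  /* every path has failed */
}
```

With both paths alive the requests alternate between them. After one path is marked dead, every request goes to the survivor, and only when all paths are dead does the device itself fail.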


BIOS-assisted «Fake» RAID


BIOS-assisted RAID in GEOM

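[Diagram: GEOM topology, bottom to top]
● SATA HDD, SATA HDD
● DISK «ada0», provider ada0; DISK «ada1», provider ada1
● DEV «ada0» → /dev/ada0; DEV «ada1» → /dev/ada1
● RAID «Intel-6eca044e», consumer of ada0 and ada1, providers raid/r0 and raid/r1
● DEV «raid/r0» → /dev/raid/r0; DEV «raid/r1» → /dev/raid/r1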


BIOS-assisted RAID in GEOM


Is GEOM fast?

Test setup:
● 4 LSI 6Gbps SAS HBAs
● 16 6Gbps SATA SSDs
● Platform 1:
– Intel Core i7-3930K, 6x2 cores @ 3.2GHz
– ASUS P9X79 WS
● Platform 2:
– 2x Intel Xeon E5645, 2x6x2 cores @ 2.4GHz
– Supermicro X8DTU

Test: Total number of IOPS from many instances of `dd if=/dev/daX of=/dev/null bs=512`


Platform 1: Core i7-3930K 3.2GHz

[Chart: IOPS for 4, 8, 12 and 16 SSDs; vertical axis from 0 to 800000]


Platform 2: 2xXeon E5645 2.4GHz

[Chart: IOPS for 4, 8, 12 and 16 SSDs; vertical axis from 0 to 450000]


Can GEOM be made faster? Yes!

Bottlenecks


Can GEOM be made faster? Yes!

Bottlenecks:
● 5 threads and up to 10 switches per request: dd, g_down, HBA HWI, CAM SWI, g_up
● GEOM threads are capped at 100% CPU
● Congested per-HBA locks in CAM

Solutions:
● Direct dispatch in GEOM
● Improved CAM locking
● More completion threads or direct dispatch in CAM


Direct dispatch in GEOM

Requirements:
● Caller should not hold any locks
● Caller should be reentrant
● Callee should not depend on g_up / g_down thread semantics
● Kernel thread stack should not overflow

Implementation:
● Per-consumer/-provider flags to declare caller and callee capabilities
● Kernel thread stack usage estimation
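
The requirements and implementation notes above amount to a per-request decision: call the next layer directly when both sides have declared it safe and there is enough stack left, otherwise fall back to queueing for the GEOM thread. A minimal sketch of that decision follows. The flag names, the `dd_choose()` function and the "less than half of the stack used" threshold are all assumptions made for illustration, not the values or identifiers used in the actual change.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical capability flags declared per consumer and per provider. */
#define DD_CALLER_OK 0x1u  /* caller holds no locks and is reentrant */
#define DD_CALLEE_OK 0x2u  /* callee does not rely on g_up/g_down semantics */

enum dd_path { DD_DIRECT, DD_QUEUED };

/*
 * Decide how to pass a request to the next layer.
 * stack_used and stack_size describe the current kernel thread stack;
 * the half-of-stack headroom rule is an assumed, conservative estimate.
 */
static enum dd_path dd_choose(unsigned caller_flags, unsigned callee_flags,
    size_t stack_used, size_t stack_size)
{
    bool caller_ok = (caller_flags & DD_CALLER_OK) != 0;
    bool callee_ok = (callee_flags & DD_CALLEE_OK) != 0;
    bool stack_ok = stack_used < stack_size / 2;

    assert(stack_used <= stack_size);
    /* Any unmet requirement falls back to the queued path. */
    return (caller_ok && callee_ok && stack_ok) ? DD_DIRECT : DD_QUEUED;
}
```

Falling back to the queued path whenever a requirement is not met keeps the old behaviour as the safe default, so layers that have not declared their capabilities keep working unchanged.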


Direct dispatch in GEOM

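[Diagram: calls crossing struct cdevsw (Application side) and struct disk (Disk side); g_event remains, g_down and g_up no longer appear]
● Open/close: d_open()/d_close() → g_access() → d_open()/d_close()
● Submission: d_strategy() → g_io_request() → d_strategy()
● Completion: biodone() → g_io_deliver() → biodone()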


Improved CAM locking

Before:
● Per-SIM locks protect everything for one SIM (HBA), from periph driver state to HBA hardware access

After:
● Per-SIM locks protect only the HBA, keeping KPI/KBI
● Queue locks protect CCB queues and serialise SIM calls to reduce SIM lock congestion
● Per-bus locks protect reference counting
● Per-target locks protect the list of LUNs
● Per-LUN locks protect device and periph


Improved CAM locking

[Diagram: two copies of the CAM object hierarchy (SIM, Bus, Target, Device, Periph), drawn with Queue and Done queue boxes]


Platform 1: Core i7-3930K 3.2GHz

[Chart: IOPS for 4, 8, 12 and 16 SSDs; series: head, done, WIP; vertical axis from 0 to 1200000]


Platform 2: 2xXeon E5645 2.4GHz

[Chart: IOPS for 4, 8, 12 and 16 SSDs; series: head, done, WIP; vertical axis from 0 to 1000000]


Can we do even more? Possibly!

Bottlenecks

Context switches


Multiple queues/IRQs support


Work In Progress

● Commit the CAM and GEOM changes.
● Add multiple queues support to HBA drivers.
● File systems, schedulers and other places outside block storage also need work to keep up. Join!

Questions?