1
Scaling PostgreSQL on Postgres-XL
February, 2015 Mason Sharp PGConf.ru
2
Agenda
• Intro
• Postgres-XL Background
• Architecture Overview
• Distributing Tables
• Configuring a Local Cluster
• Differences to PostgreSQL
• Community
3
whoami
• Co-organizer of NYC PUG
• I like database clusters
– GridSQL
– Postgres-XC
– Postgres-XL
• Work:
– TransLattice
– StormDB
– EnterpriseDB
4
PGConf.US
• March 25–27, 2015
• New York City
• 44 Talks
• 4 Tracks!
7
Postgres-XL
• Open Source
• Scale-out RDBMS
– MPP for OLAP
– OLTP write and read scalability
– Cluster-wide ACID properties
– Multi-tenant security
– Can also act as distributed key-value store
8
Postgres-XL
• Hybrid Transactional and Analytical Processing (HTAP)
• Can replace multiple systems
9
Postgres-XL
• Rich SQL support
– Correlated Subqueries
10
13
Postgres-XL Features
• Large subset of PostgreSQL SQL
• Distributed Multi-version Concurrency Control
• Contrib-friendly
• Full Text Search
• JSON and XML Support
• Distributed Key-Value Store
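As a rough illustration of the key-value use case, here is a minimal sketch; the kv table, its columns, and the hash key are illustrative assumptions rather than anything from the slides. Sharding on the key means point lookups and writes touch exactly one datanode.

CREATE TABLE kv (
    k TEXT PRIMARY KEY,   -- lookup key, also the distribution column
    v JSON                -- arbitrary payload (JSON is supported)
) DISTRIBUTE BY HASH(k);

INSERT INTO kv VALUES ('user:42', '{"name": "Ada"}');
SELECT v FROM kv WHERE k = 'user:42';   -- routed to a single datanode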
14
Postgres-XL Missing Features
• Built-in High Availability
– Use external tools like Corosync/Pacemaker
– Area for future improvement
• “Easy” to re-shard / add nodes
– Some pieces are there
– Tables are locked during redistribution
– Can however “pre-shard” multiple datanodes on the same server or VM
• Some FK and UNIQUE constraints
• Recursive CTEs (WITH)
15
Performance
Transactional workloads
[Chart: Dell DVD Store benchmark – throughput at 10 to 400 users, StormDB vs. Amazon RDS]
16
OLTP Characteristics
• The Coordinator layer adds about 30% overhead to a DBT-1 (TPC-W) workload
– A single node (4 cores) gets about 70% of the performance of vanilla PostgreSQL
• With linear scaling, 10 nodes should achieve 7x the performance of PostgreSQL
• Actual result: 6–6.4x
• More nodes and more open transactions mean larger snapshots and more visibility checking
17
MPP Performance – DBT-3 (TPC-H)
[Chart: DBT-3 (TPC-H) query times – Postgres-XL vs. PostgreSQL]
18
OLAP Characteristics
• Depends on the schema
• Star schema? (see the DDL sketch below)
– Distribute fact tables across nodes
– Replicate dimensions
• Linear/near-linear query performance for straightforward queries
• Other schemas give different results
– Sometimes “better than linear” due to more caching
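A minimal DDL sketch of that star-schema advice; the sales and date_dim tables and their columns are illustrative assumptions. The small dimension is replicated so every datanode can join against it locally, while the large fact table is hash-distributed.

CREATE TABLE date_dim (
    date_id INT,
    calendar_date DATE
) DISTRIBUTE BY REPLICATION;   -- small dimension: full copy on every datanode

CREATE TABLE sales (
    sale_id BIGINT,
    date_id INT,
    amount NUMERIC
) DISTRIBUTE BY HASH(sale_id); -- large fact table: rows spread across datanodes

-- The join against the replicated dimension can run on each datanode in parallel
SELECT d.calendar_date, sum(s.amount)
FROM sales s JOIN date_dim d ON d.date_id = s.date_id
GROUP BY d.calendar_date;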
19
Key-value Store Comparison
[Chart: key-value workload performance – Postgres-XL vs. MongoDB]
21
Multiversion Concurrency Control (MVCC)
22
MVCC
• Readers do not block writers
• Writers do not block readers
• Transaction IDs (XIDs)
– Every transaction gets an ID
• Snapshots contain a list of running XIDs
23
MVCC – Regular PostgreSQL
Example:
T1 Begin...
T2 Begin; INSERT...; Commit;
T3 Begin...
T4 Begin; SELECT
• T4's snapshot contains T1 and T3
• T2 already committed
• T4 can see T2's commits, but not T1's nor T3's
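The same visibility rule in plain SQL, as a sketch with two hypothetical sessions (the accounts table is illustrative): a snapshot taken by the SELECT excludes transactions that are still open.

-- Session A (like T1/T3: begun, not yet committed)
BEGIN;
INSERT INTO accounts VALUES (1, 100);

-- Session B (like T4: starts afterwards)
BEGIN;
SELECT * FROM accounts;   -- does not see Session A's uncommitted row
COMMIT;

-- Only after Session A commits will new snapshots include its row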
24
MVCC – Multiple Node Issues

Example:
T1 Begin...
T2 Begin; INSERT...; Commit;
T3 Begin...
T4 Begin; SELECT
• Node 1: T2 Commit, T4 SELECT
• Node 2: T4 SELECT, T2 Commit
• T4's SELECT statement returns inconsistent data
– Includes data from Node 1, but not Node 2
[Diagram: Coordinator connected to two datanodes, N1 and N2]
25
MVCC – Multiple Node Issues
Postgres-XL Solution:
• Make XIDs and Snapshots cluster-wide
26
Postgres-XL Technology
• Users connect to any Coordinator
27
Postgres-XL Technology
• Data is automatically spread across cluster
28
Postgres-XL Technology
• Queries return back data from all nodes in parallel
29
Postgres-XL Technology
• GTM maintains a consistent view of the data
30
Standard PostgreSQL
[Diagram: the standard PostgreSQL stack – Parser, Planner, Executor, MVCC and Transaction Handling, Storage]
31
Standard PostgreSQL
[Diagram: the same stack, with MVCC and Transaction Handling pulled out into the GTM]
GTM: MVCC and Transaction Handling
• Transaction ID (XID) Generation
• Snapshot Handling
32
Standard PostgreSQL
[Diagram: the PostgreSQL stack split across Postgres-XL components]
XL Coordinator: Parser, Planner, Executor, “Metadata” Storage
XL Datanode: Executor, Storage
33
Postgres-XL Architecture
• Based on the Postgres-XC project
• Coordinators
– Connection point for applications
– Parsing and planning of queries
• Data Nodes
– Actual data storage
– Local execution of queries
• Global Transaction Manager (GTM)
– Provides a consistent view to transactions
– GTM Proxy to increase performance
34
Coordinators
• Handles network connectivity to the client
• Parse and plan statements
• Perform final query processing of intermediate result sets
• Manages two-phase commit
• Stores global catalog information
35
Data Nodes
• Stores tables and indexes
• Only coordinators connect to data nodes
• Executes queries from the coordinators
• Communicates with peer data nodes for distributed joins
36
Global Transaction Manager (GTM)
• Handles necessary MVCC tasks
– Transaction IDs
– Snapshots
• Manages cluster-wide values
– Timestamps
– Sequences (see the sketch below)
• GTM Standby can be configured
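Sequences are a good example of a cluster-wide value: nextval() results are handed out by the GTM, so they stay unique regardless of which coordinator you connect to. A small sketch, with an illustrative sequence name:

CREATE SEQUENCE order_seq;

-- Connected to coordinator 1:
SELECT nextval('order_seq');   -- e.g. 1

-- Connected to coordinator 2:
SELECT nextval('order_seq');   -- a different value, never a duplicate,
                               -- because the GTM allocates it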
37
Global Transaction Manager (GTM)
• Can this be a single point of failure?
38
Global Transaction Manager (GTM)
• Can this be a single point of failure?
Yes.
39
Global Transaction Manager (GTM)
• Can this be a single point of failure?
Yes.
Solution: GTM Standby
40
Global Transaction Manager (GTM)
• Can GTM be a bottleneck?
41
Global Transaction Manager (GTM)
• Can GTM be a bottleneck?
Yes.
42
Global Transaction Manager (GTM)
• Can GTM be a bottleneck?
Yes.
Solution: GTM Proxy
[Diagram: Coordinator and Datanode backend processes connect to a local GTM Proxy, which communicates with the GTM]
43
GTM Proxy
• Runs alongside Coordinator or Datanode
• Backends use it instead of GTM
• Groups requests together
• Obtain range of transaction ids (XIDs)
• Obtain snapshot
44
GTM Proxy
• Example: 10 processes request a transaction ID
• They each connect to local GTM Proxy
• GTM Proxy sends one request to GTM for 10 XIDs
• GTM locks procarray, allocates 10 XIDs
• GTM returns range
• GTM Proxy demuxes to processes
46
Data Distribution – Replicated Tables
• Good for read only and read mainly tables
• Sometimes good for dimension tables in data warehousing
• If coordinator and datanode are colocated, local read occurs
• Bad for write-heavy tables
• Each row is replicated to all data nodes
• Statement based replication
• “Primary” avoids deadlocks
47
Data Distribution – Distributed Tables
• Good for write-heavy tables
• Good for fact tables in data warehousing
• Each row is stored on one data node
• Available strategies
– Hash
– Round Robin
– Modulo
49
Availability
• No Single Point of Failure
• Global Transaction Manager Standby
• Coordinators are load balanced
• Streaming replication for data nodes
• But, currently manual…
50
DDL
51
Defining Tables
CREATE TABLE my_table (…)
DISTRIBUTE BY
HASH(col) | MODULO(col) | ROUNDROBIN | REPLICATION
[ TO NODE (nodename[,nodename…])]
52
Defining Tables
CREATE TABLE state_code (
state_code CHAR(2),
state_name VARCHAR,
:
) DISTRIBUTE BY REPLICATION
An exact replica on each node
53
Defining Tables
CREATE TABLE invoice (
invoice_id SERIAL,
cust_id INT,
:
) DISTRIBUTE BY HASH(invoice_id)
Distributed evenly amongst sharded nodes
54
Defining Tables
CREATE TABLE invoice (
invoice_id SERIAL,
cust_id INT ….
) DISTRIBUTE BY HASH(invoice_id);
CREATE TABLE lineitem (
lineitem_id SERIAL,
invoice_id INT …
) DISTRIBUTE BY HASH(invoice_id);
Joins on invoice_id can be pushed down to the datanodes
55
test2=# create table t1 (col1 int, col2 int) distribute by hash(col1);
test2=# create table t2 (col3 int, col4 int) distribute by hash(col3);
56
explain select * from t1 inner join t2 on col1 = col3;
Remote Subquery Scan on all (datanode_1,datanode_2)
  ->  Hash Join
        Hash Cond:
        ->  Seq Scan on t1
        ->  Hash
              ->  Seq Scan on t2
Join push-down. Looks much like regular PostgreSQL
57
test2=# explain select * from t1 inner join t2 on col2 = col3;
Remote Subquery Scan on all (datanode_1,datanode_2)
  ->  Hash Join
        Hash Cond:
        ->  Remote Subquery Scan on all (datanode_1,datanode_2)
              Distribute results by H: col2
              ->  Seq Scan on t1
        ->  Hash
              ->  Seq Scan on t2
Reads t1 once and puts its rows, redistributed by col2, into shared queues for the other datanodes to consume for the join. Datanode-to-datanode communication and parallelism.
60
By the way
• PostgreSQL does not have query parallelism – yet
61
By the way
• PostgreSQL does not have query parallelism – yet
• Example: 16 core box, 1 user, 1 query
– Just 1 core will be used
– 15 (or so) idle cores
62
By the way
• PostgreSQL does not have query parallelism – yet
• Example: 16 core box, 1 user, 1 query
– Just 1 core will be used
– 15 (or so) idle cores
• You can use Postgres-XL even on a single server and configure multiple datanodes to achieve better performance thanks to parallelism
63
64
Configuration
65
66
Use pgxc_ctl!
(demo in a few minutes)
(please get from git HEAD for latest bug fixes)
67
Otherwise, PostgreSQL-like steps apply:
initgtm
initdb
68
postgresql.conf
• Very similar to regular PostgreSQL
• But there are extra parameters
– max_pool_size
– min_pool_size
– pool_maintenance_timeout
– remote_query_cost
– network_byte_cost
– sequence_range
– pooler_port
– gtm_host, gtm_port
– shared_queues
– shared_queue_size
69
Install
• Download RPMs
– http://sourceforge.net/projects/postgres-xl/files/Releases/Version_9.2rc/
• Or: configure; make; make install
70
Setup Environment
• .bashrc
• ssh is used, so add the installation bin dir to PATH
71
Avoid password with pgxc_ctl
• ssh-keygen -t rsa (in ~/.ssh)
• For a local test, no need to copy the key, it is already there:
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
72
pgxc_ctl
• Initialize cluster
• Start/stop
• Deploy to node
• Add coordinator
• Add data node / slave
• Add GTM slave
73
pgxc_ctl.conf
• Let’s create a local test cluster!
– One GTM
– One Coordinator
– Two Datanodes
• Make sure all are using different ports
– including pooler_port
74
pgxc_ctl.conf
pgxcOwner=pgxl
pgxcUser=$pgxcOwner
pgxcInstallDir=/usr/postgres-xl-9.2
75
pgxc_ctl.conf
gtmMasterDir=/data/gtm_master
gtmMasterServer=localhost
gtmSlave=n
gtmProxy=n
76
pgxc_ctl.conf
coordMasterDir=/data/coord_master
coordNames=(coord1)
coordPorts=(5432)
poolerPorts=(20010)
coordMasterServers=(localhost)
coordMasterDirs=($coordMasterDir/coord1)
77
pgxc_ctl.conf
coordMaxWALSenders=($coordMaxWALsender)
coordSlave=n
coordSpecificExtraConfig=(none)
coordSpecificExtraPgHba=(none)
78
pgxc_ctl.conf
datanodeNames=(dn1 dn2)
datanodeMasterDir=/data/dn_master
datanodePorts=(5433 5434)
datanodePoolerPorts=(20011 20012)
datanodeMasterServers=(localhost localhost)
datanodeMasterDirs=($datanodeMasterDir/dn1 $datanodeMasterDir/dn2)
79
pgxc_ctl.conf
datanodeMaxWALSenders=($datanodeMaxWalSender $datanodeMaxWalSender)
datanodeSlave=n
primaryDatanode=dn1
80
pgxc_ctl
pgxc_ctl init all
Now have a working cluster
82
PostgreSQL Differences
83
pg_catalog
• pgxc_node
– Coordinator and Datanode definition
– Currently not GTM
• pgxc_class
– Table replication or distribution info
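A quick way to poke at these catalogs from psql; the column names below (node_name, node_type, pcrelid, pclocatortype) reflect my understanding of the XC/XL catalogs, so verify with \d pgxc_node and \d pgxc_class on your build:

-- Nodes this coordinator knows about
SELECT node_name, node_type, node_host, node_port FROM pgxc_node;

-- How each user table is distributed (locator type per relation)
SELECT pcrelid::regclass AS table_name, pclocatortype FROM pgxc_class;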
84
Additional Commands
• PAUSE CLUSTER / UNPAUSE CLUSTER
– Wait for current transactions to complete
– Prevent new commands until UNPAUSE
• EXECUTE DIRECT ON (nodename) ‘command’ (see the sketch below)
• CLEAN CONNECTION
– Cleans connection pool
• CREATE NODE / ALTER NODE
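A minimal sketch of EXECUTE DIRECT, reusing datanode dn1 from the pgxc_ctl.conf example above; it runs the quoted statement on that one node only, which is handy for checking how evenly a hash-distributed table is spread:

EXECUTE DIRECT ON (dn1) 'SELECT count(*) FROM invoice';
-- Returns only the rows physically stored on dn1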
85
Noteworthy
• SELECT pgxc_pool_reload() (example below)
• pgxc_clean for 2PC
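Where pgxc_pool_reload() typically comes up: after node definitions change, the coordinator's pooled connections need to pick up the new metadata. A sketch, with an illustrative port change:

-- e.g. datanode dn2 moved to a new port
ALTER NODE dn2 WITH (PORT = 5435);
SELECT pgxc_pool_reload();   -- refresh the connection pool with the new node info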
86
Looking at Postgres-XL Code
• XC: #ifdef PGXC
• XL: #ifdef XCP, #ifndef XCP
87
Community
88
Website and Code
• postgres-xl.org
• sourceforge.net/projects/postgres-xl
• Will add a mirror on github.com
• Docs
– files.postgres-xl.org/documentation/index.html
90
Philosophy
• Stability and bug fixes over features
• Performance-focused
• Open and inclusive
– Welcome input for roadmap, priorities and design
– Welcome others to become core members and contributors
91
Roadmap
• New release
• Improve Cluster Management
• Catch up to the PostgreSQL community -> 9.3, 9.4
• Make XL a more robust analytical platform
– Distributed Foreign Data Wrappers
• GTM optional
• Native High Availability