
POSTGRESQL HIGH AVAILABILITY IN A CONTAINERIZED WORLD

Jignesh Shah

Chief Architect, Data Platform

About @jkshah
✓ appOrbit
  • Focus is on data management of applications running in containers
✓ VMware
  • Led and managed the Postgres and data management teams at VMware for various products embedding PostgreSQL running in virtualized embedded instances
✓ Sun Microsystems
  • Team member of the first published SPECjAppServer2004 benchmark with PostgreSQL
  • Performance of PostgreSQL on Solaris/Sun servers
✓ Working with the PostgreSQL community since 2005
  • http://jkshah.blogspot.com/2005/04/profiling-postgresql-using-dtrace-on_22.html
✓ Working with container technologies (Solaris Zones) since 2004
  • http://jkshah.blogspot.com/2004/08/db2-working-under-solaris-10-zones_30.html

Agenda
✓ Containers
✓ Enterprise Needs
✓ PostgreSQL Replication
✓ Modern Projects
✓ Blueprint of Deployments

What are Containers?
✓ OS-level virtualization where the kernel allows for multiple isolated user-space instances

[Diagram: four stacks side by side. (1) An operating system on a bare metal server. (2) Guest OSes on a hypervisor on a bare metal server. (3) Containers (C) on an operating system on a bare metal server. (4) Containers (C) inside guest OSes on a hypervisor on a bare metal server.]

Advantages of Containers
✓ Lower footprint
✓ Very quick startup and shutdown
✓ Density
✓ Nesting

Disadvantages of Containers
✓ Same kernel version
✓ Cannot run other OSes natively
✓ Security (to be improved)
✓ Not a complete solution for enterprise needs

Where to use containers?
✓ Recreating identical environments (cookie-cutter)
✓ Resource grouping of specific processes on a heavily loaded server
✓ Handling multiple versions of software applications
✓ Ephemeral application instances (Dev/Test)
✓ Production instances (growing every day)
✓ Many more

Docker: Popular Container Engine
• Installation

    # sudo tee /etc/yum.repos.d/docker.repo <<-'EOF'
    [dockerrepo]
    name=Docker Repository
    baseurl=https://yum.dockerproject.org/repo/main/centos/7/
    enabled=1
    gpgcheck=1
    gpgkey=https://yum.dockerproject.org/gpg
    EOF
    # yum install docker-engine
    # systemctl enable docker.service
    # systemctl start docker.service

Docker
✓ Quick guide to using a Docker-based container

    # docker run --name mycontainer -e POSTGRES_PASSWORD=mysecretpassword -d postgres
    # docker exec -ti mycontainer psql -U postgres
    # docker stop mycontainer
    # docker rm mycontainer
    # docker rmi postgres

Container Volumes
✓ Persist beyond the life of a Docker container
  • Declared with the VOLUME instruction in a Dockerfile, or
  • With the -v option of docker run
  • Automatically created if not already present during docker run
  • Not part of docker push/pull operations
  • Can select a non-local directory using --volume-driver
  • Third-party components are required for multi-host support (NFS, etc.)
✓ Different options using -v
  • -v /hostsrc/data:/opt/data:ro # for read-only volumes (default rw)
  • -v /hostsrc/data:/opt/data:Z # SELinux relabeling: Z for a private volume, z for a shared volume
  • -v /etc/nginx.conf:/etc/nginx.conf # for mounting a single file only
✓ Volumes can be shared from another container on the same host using --volumes-from
✓ Starting with Docker 1.9, volumes have first-class status

PostgreSQL Container as a DB server
✓ Maybe you want a standalone database server
  • Not all database clients will be on the same host
  • Need to limit memory usage
  • Need a different layout of how files are distributed
✓ Use the -p option to make the port available even to non-container clients
✓ Use -m to limit memory usage by the DB server (by default it can see and use all of the host's memory)
  • Note: this does not set shared_buffers automatically with the library image

    docker run --name mycontainer -m 4g -e POSTGRES_PASSWORD=mysecretpassword \
        -v /hostpath/pgdata:/var/lib/postgresql/data -p 5432:5432 -d postgres
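Because the -m limit does not adjust shared_buffers, the value has to be derived separately. The following is a minimal sketch, not part of the original slides: it assumes the common rule of thumb of starting shared_buffers at roughly 25% of available memory, and the function names are hypothetical.

```python
def suggest_shared_buffers_mb(container_limit_mb: int, fraction: float = 0.25) -> int:
    """Return a starting shared_buffers size in MB for a memory-limited container.

    The 25% default is an assumed rule of thumb, not a value from the slides.
    """
    if container_limit_mb <= 0:
        raise ValueError("container_limit_mb must be positive")
    if not 0 < fraction < 1:
        raise ValueError("fraction must be between 0 and 1")
    return int(container_limit_mb * fraction)


def postgres_setting(container_limit_mb: int) -> str:
    """Format the suggestion as a postgresql.conf line."""
    return f"shared_buffers = {suggest_shared_buffers_mb(container_limit_mb)}MB"


if __name__ == "__main__":
    # "-m 4g" in the docker run command above corresponds to 4096 MB.
    print(postgres_setting(4096))
```

A custom image could run a calculation like this in its entrypoint and write the result into postgresql.conf before starting the server.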

PostgreSQL in an enterprise environment
✓ However, for a real production use case we would need
  • Bigger shared memory configurations
  • A different layout of how files are distributed
  • Ability to back up the database
  • Ability to set up replication
  • etc.
✓ In short, we need a more customized image of PostgreSQL

Best Practices for custom image
✓ For production installs, customize the Docker image
  • Allocate proper memory limits, for example 8 GB
      • All page cache usage shows up as Docker container memory usage
  • Bump up shared buffers and other parameters as required
      • Hint: use PostgreSQL 9.3 or later; otherwise you have to run privileged containers
      • http://jkshah.blogspot.com/2015/09/is-it-privilege-to-run-container-in.html
  • Support multiple volumes in your image
      • PITR archives
      • Full backup directory
  • PostgreSQL extensions
  • Set up replication support
      • Out-of-the-box replication setup
  • Monitoring tool
      • Your favorite monitoring agent


Enterprise Needs for Databases

Planning a High Availability Strategy
✓ Requirements
  • Recovery Time Objective (RTO)
      • What does 99.99% availability really mean?
  • Recovery Point Objective (RPO)
      • Zero data loss?
  • HA vs. DR requirements
✓ Evaluating a technology
  • What's the cost of implementing the technology?
  • What's the complexity of implementing and managing the technology?
  • What's the downtime potential?
  • What's the data loss exposure?

Availability              Downtime / Year   Downtime / Month*   Downtime / Week
"Two Nines" - 99%         3.65 days         7.2 hours           1.68 hours
"Three Nines" - 99.9%     8.76 hours        43.2 minutes        10.1 minutes
"Four Nines" - 99.99%     52.56 minutes     4.32 minutes        1.01 minutes
"Five Nines" - 99.999%    5.26 minutes      25.9 seconds        6.05 seconds

* Using a 30-day month
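The figures in an availability table like this one follow from simple arithmetic: allowed downtime is the unavailable fraction multiplied by the length of the period. The sketch below is illustrative only; it assumes a 365-day year, a 30-day month, and a 7-day week, and the function name is hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Assumed period lengths in days (the month matches the table's footnote).
PERIODS_IN_DAYS = {"year": 365, "month": 30, "week": 7}


def allowed_downtime_seconds(availability_percent: float, period: str) -> float:
    """Return the downtime budget in seconds for a given availability and period."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability_percent must be between 0 and 100")
    unavailable_fraction = (100.0 - availability_percent) / 100.0
    return unavailable_fraction * PERIODS_IN_DAYS[period] * SECONDS_PER_DAY


if __name__ == "__main__":
    for nines in (99.0, 99.9, 99.99, 99.999):
        minutes_per_year = allowed_downtime_seconds(nines, "year") / 60
        print(f"{nines}% -> {minutes_per_year:.2f} minutes of downtime per year")
```

For example, 99.99% availability over a 365-day year leaves about 52.56 minutes of downtime, which is the budget that every failover, patch, and upgrade has to fit into.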

Simplified View of HA PostgreSQL
✓ Easy to set up
✓ Handles infrastructure problems
✓ Exploit storage features
✓ Exploit replication features

[Diagram labels: Applications, DNS Name, Somewhere in Cloud/Data Center]

Causes of Downtime
✓ Planned Downtime
  • Software upgrade (OS patches, SQL Server cumulative updates)
  • Hardware/BIOS upgrade
✓ Unplanned Downtime
  • Datacenter failure (natural disasters, fire)
  • Server failure (failed CPU, bad network card)
  • I/O subsystem failure (disk failure, controller failure)
  • Software/data corruption (application bugs, OS binary corruption)
  • User error (shut down a SQL service, dropped a table)

Typical Plan of action
✓ Minimize the causes that lead to downtime
✓ Faster recovery time (balanced checkpoints)
✓ Proxies for fast switching between production and DR copy
✓ Shared storage for HA
✓ PostgreSQL synchronous replication to go beyond

HA PostgreSQL with Shared Storage
✓ Ability to leverage hardware snapshots/restore
✓ Automated failover using OS clustering software
✓ Block-level replication for DR
✓ Distributed shared storage getting popular

[Diagram labels: Applications, Virtual IP or DNS or pgPool or pgBouncer, Site 1]

PostgreSQL Replication
✓ Single master, multi-slave
✓ Cascading slaves also possible
✓ Mechanism based on WAL (Write-Ahead Logs)
✓ Multiple modes and multiple recovery ways
  • Warm standby
  • Asynchronous hot standby
  • Synchronous hot standby
✓ Slaves can optionally perform read operations
  • Good for read scaling
✓ Node failover and reconnection possible

HA PostgreSQL with Sync Replication
✓ Synchronous replication within the data center
✓ Low downtime (lower than HA)
✓ Automated failover for hardware issues, including storage

[Diagram labels: Applications, Virtual IP or DNS or pgPool or pgBouncer, Site 1]

PostgreSQL Replication
✓ In-core replication does replication well
  • But no automated failover
  • "Failback" (pg_rewind, thank god)
  • Load-balanced IP address
  • Get your own proxy (HAProxy? pgbouncer? pgpool?)
  • No way to preserve connections

[Photo credit: dundanim / Shutterstock.com]

Just PostgreSQL?
✓ Need more projects
  • pgPool2 / HAProxy / pgbouncer
  • repmgr, etc.
✓ Some customers at this time prefer cloud DBaaS
  • Heroku
  • Amazon RDS
✓ Some end up preferring an enterprise version of DBaaS
  • appOrbit :)

Modern HA Projects
✓ Patroni / Governor
  • https://github.com/zalando/patroni (Python)
  • Docker container
  • etcd
  • HAProxy
✓ Stolon
  • https://github.com/sorintlab/stolon (Golang)
  • Docker
  • etcd / Consul
  • Custom proxy


Governor

https://github.com/compose/governor/blob/master/postgres-ha.pdf


Stolon

https://github.com/sorintlab/stolon/blob/master/doc/architecture_small.png

Basic Container based HA Architecture
✓ Need a distributed store to hold configuration status
  • Consul
  • ZooKeeper
  • etcd
✓ PostgreSQL cluster peer (self-managing)
  • Determines local instance status and updates configuration status
  • Master regularly updates its status; if it fails to do so, it is considered failed
  • If the master fails, an election based on least lag is held and the new leader takes over
  • Other standbys now follow the new master
  • Potentially a third party can even provision the dead master as a slave
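The peer behavior listed above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the algorithm used by any specific project: an in-memory dictionary stands in for the distributed store, and the class, field, and node names are hypothetical.

```python
import time
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Member:
    name: str
    role: str              # "master", "standby", or "failed"
    replay_lag_bytes: int  # how far this node is behind the last known master position
    last_seen: float       # timestamp of the node's last status update


class ClusterState:
    """In-memory stand-in for the distributed store (Consul, ZooKeeper, etcd)."""

    def __init__(self, ttl_seconds: float) -> None:
        self.ttl_seconds = ttl_seconds
        self.members: Dict[str, Member] = {}

    def report(self, name: str, role: str, replay_lag_bytes: int,
               now: Optional[float] = None) -> None:
        """Each peer regularly publishes its own status."""
        now = time.time() if now is None else now
        self.members[name] = Member(name, role, replay_lag_bytes, now)

    def is_alive(self, member: Member, now: float) -> bool:
        """A peer that has not reported within the TTL is considered failed."""
        return now - member.last_seen <= self.ttl_seconds

    def elect(self, now: float) -> Optional[str]:
        """Return the current master, promoting the least-lagged standby if needed."""
        for member in self.members.values():
            if member.role == "master" and self.is_alive(member, now):
                return member.name
        candidates = [m for m in self.members.values()
                      if m.role == "standby" and self.is_alive(m, now)]
        if not candidates:
            return None
        winner = min(candidates, key=lambda m: (m.replay_lag_bytes, m.name))
        for member in self.members.values():
            if member.role == "master":
                member.role = "failed"  # stale master; could later be re-provisioned as a slave
        winner.role = "master"
        return winner.name
```

A real deployment would rely on the store's atomic compare-and-set and key expiry so that two peers can never both win the election; the sketch only shows the decision rule.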

Some New Trends in Container World
✓ Binaries and data are often separated
  • One lives in the container image and the other in volumes
✓ pg_xlog is no longer deployed on separate volumes
  • Underlying storage technologies can lead to inconsistent point-in-time restores, leaving the DB unusable
✓ No new tablespaces
  • Hard to get easy replication setups done on the fly
  • Could lead to lost data if new tablespaces are not on volumes
✓ Replication is set up with automation rather than manually by admins

Some New Trends in Container World
✓ Adoption of microservices
  • Leading to lots of smaller databases, one for each microservice
✓ Faster updates
  • Schema changes sometimes need to be backward compatible
✓ Repeatable deployments
  • Need to redeploy at a moment's notice

Deployment of PostgreSQL "Cluster"
✓ Can be made self-healing
✓ Integrate with pg_rewind to reuse the old master as a slave
✓ Integrate with shared storage to leverage snapshots to create new slaves

[Diagram labels: Applications, Virtual IP, Instance 1, Instance 2, Instance 3, Shared Storage]

But Wait, I have multiple DB Servers
✓ I need my clusters to grow dynamically (read scaling)
✓ I also want things to auto-heal as much as they can

[Diagram label: Applications]

Kubernetes
✓ Production-grade container orchestrator
✓ Horizontal scaling
  • Set up rules to scale slaves
✓ ConfigMap
  • postgresql.conf
  • pg_hba.conf
✓ Secrets
  • Usernames and passwords
  • Certificates

Kubernetes
✓ Persistent storage features are evolving
  • Plugins for storage drivers
✓ External services
  • Services are accessible from all nodes
  • Shared storage plugins make your stateful containers HA as well
  • Powerful combination along with PostgreSQL replication
      • Can spin up slaves fast for multi-TB databases

Production Grade Orchestrator
✓ Can even add rules to spin up new slaves based on read load

[Diagram labels: Operations, Applications]
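A rule for spinning up slaves under read load can be as simple as a proportional calculation. The sketch below borrows the proportional formula documented for the Kubernetes Horizontal Pod Autoscaler; treating per-replica read load as the metric is an assumption made here for illustration, and the function and parameter names are hypothetical.

```python
import math


def desired_replicas(current_replicas: int, current_load: float, target_load: float,
                     min_replicas: int = 1, max_replicas: int = 5) -> int:
    """Scale the number of read replicas in proportion to the observed load.

    current_load and target_load are per-replica values of the same metric,
    for example average read queries per second (an assumed metric).
    """
    if current_replicas < 1:
        raise ValueError("current_replicas must be at least 1")
    if target_load <= 0:
        raise ValueError("target_load must be positive")
    raw = math.ceil(current_replicas * current_load / target_load)
    # Clamp to the configured bounds so a load spike cannot scale without limit.
    return max(min_replicas, min(max_replicas, raw))


if __name__ == "__main__":
    # Two replicas each seeing 90 reads/s against a target of 60 reads/s.
    print(desired_replicas(2, 90, 60))
```

The upper bound matters for databases in particular, because each new slave has to be seeded from a snapshot or base backup before it can serve reads.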

But Wait .. Need to support Multi-Geo
✓ It could be a DR strategy
✓ It could be compliance requirements
✓ Service discovery is now getting complicated

[Diagram labels: Operations, Applications]

Consul
• Service Discovery
• Failure Detection
• Multi Data Center
• DNS Query Interface

    {
      "service": {
        "name": "mypostgresql",
        "tags": ["master"],
        "address": "127.0.0.1",
        "port": 5432,
        "enableTagOverride": false
      }
    }

    nslookup master.mypostgresql.service.domain
    nslookup mypostgresql.service.domain
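The two lookups above follow Consul's DNS naming pattern of an optional tag, the service name, the literal label "service", an optional datacenter, and the domain. The helper below is a small sketch of that pattern; the function name is hypothetical, and "consul" is used as the default because it is Consul's default DNS domain.

```python
from typing import Optional


def consul_dns_name(service: str, tag: Optional[str] = None,
                    datacenter: Optional[str] = None, domain: str = "consul") -> str:
    """Build a name of the form [tag.]<service>.service[.datacenter].<domain>."""
    labels = []
    if tag:
        labels.append(tag)
    labels.extend([service, "service"])
    if datacenter:
        labels.append(datacenter)
    labels.append(domain)
    return ".".join(labels)


if __name__ == "__main__":
    # Reproduces the two names queried with nslookup above.
    print(consul_dns_name("mypostgresql", tag="master", domain="domain"))
    print(consul_dns_name("mypostgresql", domain="domain"))
```

Applications that always connect to the tagged name reach whichever node currently carries the "master" tag, so a failover only needs to move the tag.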

Service Discovery
✓ Uniform DNS name for your database
✓ Cloud-agnostic naming
✓ Certificates created using DNS names you own
✓ No single point of failure

[Diagram labels: Operations, Applications]

PostgreSQL Enhancement
✓ SRV record of the name server
  • https://en.wikipedia.org/wiki/SRV_record
  • IP:Port
✓ PostgreSQL libpq client enhancement
  • Support service discovery using SRV records
  • A service name is passed
  • libpq looks up the SRV record from the name server
  • Connects to the port provided by the SRV record
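To make the proposed client behavior concrete, the sketch below shows how a client could choose a target once it has the SRV records for a service name. It is a simplified weighted pick in the spirit of RFC 2782, not libpq code; the record values and host names are hypothetical, and the DNS lookup itself is left out.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class SrvRecord:
    priority: int  # lower values are tried first
    weight: int    # relative share among records of equal priority
    port: int
    target: str    # host name to connect to


def pick_srv_target(records: List[SrvRecord],
                    rng: Optional[random.Random] = None) -> SrvRecord:
    """Pick a record from the lowest-priority group, weighted by its weight."""
    if not records:
        raise ValueError("no SRV records to choose from")
    if rng is None:
        rng = random.Random()
    best_priority = min(r.priority for r in records)
    candidates = [r for r in records if r.priority == best_priority]
    total_weight = sum(r.weight for r in candidates)
    if total_weight == 0:
        return rng.choice(candidates)
    threshold = rng.uniform(0, total_weight)
    running = 0
    for record in candidates:
        running += record.weight
        if threshold <= running:
            return record
    return candidates[-1]


def conninfo(record: SrvRecord, dbname: str) -> str:
    """Format the chosen record as a libpq-style keyword/value connection string."""
    return f"host={record.target} port={record.port} dbname={dbname}"
```

With this approach the application only needs to know the service name; host and port changes after a failover are picked up from DNS.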

Summary
✓ PostgreSQL "Cluster" deployments are the wave of change
✓ Containers are one of the technologies, but not the solution

Your Feedback is Important!
✓ We'd like to understand your use of Postgres for HA / DR.
✓ If interested:
  • Twitter: @jkshah
  • Email: [email protected]

Thanks. Questions?

Follow me on Twitter: @jkshah
Blog: http://jkshah.blogspot.com

Full copies of your applications at the push of a button

We are HIRING !!!